Galerkin approximation of linear problems in Banach and Hilbert spaces
Wolfgang Arendt, Isabelle Chalendar (LAMA), Robert Eymard (LAMA)

Wolfgang Arendt, Institute of Applied Analysis, University of Ulm, Helmholtzstr. 18, D-89069 Ulm (Germany)
Isabelle Chalendar, Université Paris-Est, LAMA (UMR 8050), UPEM, UPEC, CNRS, F-77454 Marne-la-Vallée (France)
Robert Eymard, Université Paris-Est, LAMA (UMR 8050), UPEM, UPEC, CNRS, F-77454 Marne-la-Vallée (France)
Abstract.
In this paper we study the conforming Galerkin approximation of the problem: find $u\in U$ such that $a(u,v)=\langle L,v\rangle$ for all $v\in V$, where $U$ and $V$ are Hilbert or Banach spaces, $a$ is a continuous bilinear or sesquilinear form and $L\in V'$ a given data. The approximate solution is sought in a finite dimensional subspace of $U$, and test functions are taken in a finite dimensional subspace of $V$. We provide a necessary and sufficient condition on the form $a$ for convergence of the Galerkin approximation, which is also equivalent to convergence of the Galerkin approximation for the adjoint problem. We also characterize the fact that $U$ has a finite dimensional Schauder decomposition in terms of properties related to the Galerkin approximation. In the case of Hilbert spaces, we prove that the only bilinear or sesquilinear forms for which any Galerkin approximation converges (this property is called the universal Galerkin property) are the essentially coercive forms. In this case, a generalization of the Aubin-Nitsche Theorem leads to optimal a priori estimates in terms of regularity properties of the right-hand side $L$, as shown by several applications. Finally, a section entitled "Supplement" provides some consequences of our results for the approximation of saddle point problems.
Key words and phrases:
Galerkin approximation, sesquilinear coercive forms, approximation properties in Banach spaces, essential coercivity, universal Galerkin convergence
2010 Mathematics Subject Classification:
65N30,47A07,47A52,46B20
1. Introduction
Due to its practical importance, the approximation of elliptic problems in Banach or Hilbert spaces has been the object of numerous works. In Hilbert spaces, a crucial result is the combined use of the Lax-Milgram theorem and of Céa's Lemma to conclude the convergence of conforming Galerkin methods when the elliptic problem arises from a coercive bilinear or sesquilinear form.
But the coercivity property is lost in many practical situations: for example, consider the Laplace operator perturbed by a convection term or a reaction term (see the example in Section 7.2), and the approximation of non-coercive forms must be studied as well. For particular bilinear or sesquilinear forms, the Fredholm alternative provides an existence result in the case where the problem is well-posed in the Hadamard sense. Such results have been extended by Banach, Nečas, Babuška and Brezzi in the case of bilinear forms on Banach spaces. The conforming approximation of such problems enters into the framework of the so-called Petrov–Galerkin methods, for which sufficient conditions for the convergence are classical (see for example the references [2, 8, 12, 31] which also include the case of non-conforming approximations).
Nevertheless, these sufficient conditions do not guarantee that for a given problem, there exists a converging Galerkin approximation. Moreover, they do not answer the following question, which is important in practice: under which conditions does the Galerkin approximation exist and converge to the solution of the continuous problem for any sufficiently fine approximation (for example, letting the degree of an approximating polynomial or the number of modes in a Fourier approximation be high enough, or letting the size of the mesh for a finite element method be small enough, and, in the case of Hilbert spaces, using the Galerkin method and not the Petrov–Galerkin method)?
The aim of this paper is precisely to address such questions for not necessarily coercive bilinear or sesquilinear forms defined on some Banach or Hilbert spaces (we treat the real and complex cases simultaneously). We shall restrict this study to conforming approximations, in the sense that the approximation will be sought in subspaces of the underlying space, using the continuous bilinear or sesquilinear form.
In the first part we consider the Banach space framework. Given a continuous bilinear form $a\colon U\times V\to\mathbb{R}$, where $U$ and $V$ are reflexive, separable Banach spaces, one is interested in the existence and the convergence of the Galerkin approximation to $u$, where $u$ is the solution of the following problem:
\[ \text{find } u\in U \text{ such that } a(u,v)=\langle L,v\rangle \text{ for all } v\in V, \tag{1.1} \]
where $L\in V'$ is given (the existence and uniqueness of $u$ are obtained under the Banach-Nečas-Babuška conditions, see for example [12, Theorem 2.6]). For approximating sequences $(U_n)_{n\in\mathbb{N}}$, $(V_n)_{n\in\mathbb{N}}$ (see Section 2 for the definition), the Galerkin approximation of (1.1) is given by the sequence $(u_n)_{n\in\mathbb{N}}$ such that, for any $n\in\mathbb{N}$, $u_n$ is the solution of the following finite dimensional linear problem:
\[ \text{find } u_n\in U_n \text{ such that } a(u_n,\chi)=\langle L,\chi\rangle \text{ for all } \chi\in V_n. \tag{1.2} \]
It is known that, if $\dim U_n=\dim V_n$, the uniform Banach-Nečas-Babuška condition (BNB) given in Section 2 is sufficient for these existence and convergence properties (see for example [12, Theorem 2.24]). We show here that this condition is also necessary and, surprisingly, that the convergence of the Galerkin approximation of (1.1) is equivalent to that of the Galerkin approximation of the dual problem.
These two results seem to be new and are presented in Section 2.
In Section 3, we ask the following: given a form $a$ such that (1.1) is well-posed, do there always exist approximating sequences in $U$ and $V$ such that the Galerkin approximation converges? Surprisingly, the answer is negative (even though the spaces $U$ and $V$ are supposed to be reflexive and separable). In fact, such approximating sequences exist if and only if the Banach space $U$ has a finite dimensional Schauder decomposition, a property which is strictly more general than having a Schauder basis.
In the remainder of the paper, only Hilbert spaces are considered and moreover we assume that $U=V$ and $U_n=V_n$ for all $n\in\mathbb{N}$. We are given a continuous bilinear form $a\colon V\times V\to\mathbb{R}$, where $V$ is a separable Hilbert space. Assuming that (1.1) is well-posed, we show that the convergence of the Galerkin approximation for all approximating sequences in $V$ (which we call here the universal Galerkin property) is equivalent to $a$ being essentially coercive, which means that a compact perturbation of $a$ is coercive. This notion of essential coercivity can also be characterized by a certain weak-strong inverse continuity of $a$, which, in fact, we take as definition of essential coercivity (Definition 4.2).
We then derive improved a priori error estimates by generalizing the Aubin–Nitsche argument to non-symmetric forms and also allowing the given right hand side of (1.1) to belong to arbitrary interpolation spaces in between and . These generalizations are applied to two cases: the approximation of selfadjoint positive operators with compact resolvent (in this case, it is seen that our a priori error estimate is optimal, with the fastest speed of convergence for in , the slowest for ) and the finite element approximation of a non-selfadjoint elliptic differential operator, including convection and reaction terms which is indeed essentially coercive.
We finally give some further historical remarks in Section 8, where we consider saddle point problems. As a consequence of our results, we show that Brezzi’s conditions, implying the convergence of mixed approximations (which are the Galerkin ones in the case of saddle point problems), are also necessary for this convergence.
To avoid any ambiguity, in the sequel, we let and .
2. Petrov–Galerkin approximation
In this section we give a characterization of the convergence of Petrov–Galerkin methods, that, for short, we call Galerkin convergence. A basic definition is the following.
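Definition 2.1 (Approximating sequences of Banach spaces).
Let $V$ be a separable Banach space. An approximating sequence of $V$ is a sequence $(V_n)_{n\in\mathbb{N}}$ of finite dimensional subspaces of $V$ such that
\[ \lim_{n\to\infty}\operatorname{dist}(v,V_n)=0 \]
for all $v\in V$, where $\operatorname{dist}(v,V_n):=\inf\{\|v-\chi\|_V : \chi\in V_n\}$.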
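As a minimal numerical sketch of Definition 2.1, assume the concrete space $\ell^2$ and the subspaces $V_n=\operatorname{span}\{e_1,\dots,e_n\}$ (both are illustrative choices, not taken from the paper). The best approximation of $v$ in $V_n$ keeps the first $n$ coordinates, so $\operatorname{dist}(v,V_n)$ is the norm of the tail. The test vector $v_k=1/k$ and the cut-off `K` are hypothetical.

```python
import math

def dist_to_truncation(v, n):
    """Distance in l^2 from the finite sequence v to span{e_1, ..., e_n}.

    The best approximation keeps the first n coordinates, so the distance
    is the l^2 norm of the remaining tail.
    """
    return math.sqrt(sum(abs(x) ** 2 for x in v[n:]))

# Hypothetical test vector v_k = 1/k, cut off at K terms for the computation.
K = 10_000
v = [1.0 / k for k in range(1, K + 1)]

# Distances to V_1, V_10, V_100, V_1000: they decrease towards 0.
dists = [dist_to_truncation(v, n) for n in (1, 10, 100, 1000)]
```

The list `dists` decreases as $n$ grows, which is the property required of an approximating sequence for this particular $v$.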
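Now let $U$ and $V$ be two separable, reflexive Banach spaces over $\mathbb{K}=\mathbb{R}$ or $\mathbb{C}$ and $a\colon U\times V\to\mathbb{K}$ be a continuous sesquilinear form such that
\[ |a(u,v)|\le M\|u\|_U\|v\|_V \quad\text{for all } u\in U,\ v\in V, \]
where $M>0$ is a constant. We assume that $U$ and $V$ are infinite dimensional and that $(U_n)_{n\in\mathbb{N}}$ and $(V_n)_{n\in\mathbb{N}}$ are approximating sequences of $U$ and $V$ respectively. We also assume throughout that
\[ \dim U_n=\dim V_n \quad\text{for all } n\in\mathbb{N}. \]
Given $L\in V'$ we search a solution $u$ of the problem:
\[ \text{find } u\in U \text{ such that } a(u,v)=\langle L,v\rangle \text{ for all } v\in V. \tag{2.1} \]
Moreover we want to approximate such a solution by $u_n$, the solution of the problem:
\[ \text{find } u_n\in U_n \text{ such that } a(u_n,\chi)=\langle L,\chi\rangle \text{ for all } \chi\in V_n. \tag{2.2} \]
Note that, given $n\in\mathbb{N}$, there exists a unique $u_n$ satisfying (2.2) if and only if
\[ \text{for all } u\in U_n:\quad a(u,\chi)=0 \text{ for all } \chi\in V_n \ \Longrightarrow\ u=0, \tag{2.3} \]
since, by assumption, $U_n$ and $V_n$ have the same finite dimension.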
Let us briefly recall the origin of the Banach-Nečas-Babuška conditions for the well-posedness of (2.1) as stated for example in [12, 31, 2] (equivalent conditions are proposed in [8] in the case of Hilbert spaces). Let us consider the associated operator $\mathcal{A}\colon U\to V'$ defined by
\[ \langle\mathcal{A}u,v\rangle=a(u,v)\quad (u\in U,\ v\in V). \]
Then $\mathcal{A}$ is linear, bounded with $\|\mathcal{A}\|\le M$. By the Inverse Mapping Theorem, $\mathcal{A}$ has closed range and is injective if and only if there exists $\beta>0$ such that
\[ \|\mathcal{A}u\|_{V'}\ge\beta\|u\|_U\quad\text{for all } u\in U. \tag{2.4} \]
By the definition of the norm of $V'$, this can be reformulated by
\[ \sup_{v\in V,\ \|v\|_V\le 1}|a(u,v)|\ge\beta\|u\|_U\quad\text{for all } u\in U. \tag{2.5} \]
Recall that $\mathcal{A}$ is invertible if and only if $\mathcal{A}$ is injective and has a closed and dense range. By the Hahn-Banach theorem, $\mathcal{A}$ has dense range if and only if no non-zero continuous functional on $V'$ annihilates the range of $\mathcal{A}$. By reflexivity, this is equivalent to the following uniqueness property:
\[ \text{for all } v\in V:\quad a(u,v)=0 \text{ for all } u\in U \ \Longrightarrow\ v=0. \tag{2.6} \]
Thus (2.1) is well-posed (i.e. for all $L\in V'$ there exists a unique $u\in U$ satisfying (2.1)) if and only if (2.5) and (2.6) are satisfied. In fact, Hadamard's definition of well-posedness also requires continuity of the inverse operator, which here automatically follows from bijectivity by the Inverse Mapping Theorem.
In order to obtain a result of convergence of the approximate solutions we consider the following uniform Banach-Nečas-Babuška condition (called Ladyzhenskaya-Babuška-Brezzi condition in the framework of the mixed formulations, i.e. approximation of saddle point problems, see also Section 8), which is the estimate (2.5) for the restriction of $a$ to $U_n\times V_n$, uniformly in $n\in\mathbb{N}$, namely
\[ \text{there exists } \beta>0 \text{ such that } \sup_{v\in V_n,\ \|v\|_V\le 1}|a(u,v)|\ge\beta\|u\|_U \text{ for all } u\in U_n \text{ and all } n\in\mathbb{N}. \tag{BNB} \]
Remark 2.2.
Condition (BNB) is also called the inf-sup condition since by the Hahn-Banach Theorem it can be reformulated as
\[ \inf_{u\in U_n,\ \|u\|_U=1}\ \sup_{v\in V_n,\ \|v\|_V=1}|a(u,v)|\ge\beta\quad\text{for all } n\in\mathbb{N}. \]
More precisely, this is the uniform or discrete BNB-condition which is used for approximation whereas (2.5) is the continuous BNB-condition which expresses well-posedness of the problem and can also be expressed by an inf-sup condition (see for example [15, Lemma 6.95 and Lemma 6.110]). The use of the abbreviation (LBB) relates this inequality to the work of Ladyzhenskaya [18] who, after a previous contribution due to Babuška [3], used it to prove well-posedness. Brezzi [4] introduced the analogue of the uniform BNB-condition for the treatment of saddle point problems (see Section 8 for more details).
Usually, in the numerical analysis community, one uses the name "inf-sup" condition (or LBB condition) only in the context of saddle point problems (see condition (8.1) in Section 8). We keep the name "(BNB) condition", following the monograph [12].
We recall that (BNB) implies that the approximate solutions converge to the solution if the problem is well-posed (see for example [12, 31, 2]). Here we will show that (BNB) is actually equivalent to Galerkin-convergence, and surprisingly also to Galerkin-convergence for the dual problem.
Definition 2.3 (Convergence of Galerkin approximation).
We say that the Galerkin-approximation converges if (2.1) as well as (2.2) are well-posed for all $L\in V'$ and $n\in\mathbb{N}$ and if, in addition, there exists a constant $\gamma>0$ independent of $n$ and $L$ such that
\[ \|u-u_n\|_U\le\gamma\operatorname{dist}(u,U_n), \tag{2.7} \]
where $u$ is the solution of (2.1) and $u_n$ the solution of (2.2) for $L\in V'$ and $n\in\mathbb{N}$. In particular, $u_n\to u$ in $U$.
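Definition 2.3 can be illustrated on a model problem that is an assumption of this sketch and not an example from the paper: $U=V=H^1_0(0,1)$, $a(u,v)=\int_0^1 u'v'$, $\langle L,v\rangle=\int_0^1 v$, with $U_n=V_n$ the continuous piecewise linear functions on `m` equal cells. The exact solution is $u(x)=x(1-x)/2$. Since this form is symmetric and coercive, Galerkin orthogonality gives $a(u-u_n,u-u_n)=a(u,u)-a(u_n,u_n)$ with $a(u,u)=1/12$, which the helper `energy_error` (a hypothetical name) uses.

```python
import math

def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[0] and upper[-1] are not used."""
    n = len(diag)
    c, d = [0.0] * n, [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def galerkin_p1(m):
    """Nodal values of the Galerkin solution on m equal cells of (0, 1)."""
    h = 1.0 / m
    n = m - 1                    # number of interior hat functions
    diag = [2.0 / h] * n         # a(phi_i, phi_i)
    off = [-1.0 / h] * n         # a(phi_i, phi_{i+1})
    rhs = [h] * n                # <L, phi_i> = integral of phi_i
    return h, solve_tridiagonal(off, diag, off, rhs)

def energy_error(m):
    """Norm of u - u_n in H^1_0, via a(u,u) - a(u_n,u_n) with a(u_n,u_n) = <L,u_n>."""
    h, coeffs = galerkin_p1(m)
    return math.sqrt(max(1.0 / 12.0 - h * sum(coeffs), 0.0))
```

Halving the mesh size halves `energy_error`, in line with an estimate of the form (2.7) combined with the first-order approximation property of piecewise linear functions.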
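We may also consider the dual problem of (2.1) where $a$ is replaced by the adjoint form $a^*\colon V\times U\to\mathbb{K}$ given by
\[ a^*(v,u)=\overline{a(u,v)}\quad (v\in V,\ u\in U). \]
If in Definition 2.3 the form $a$ is replaced by $a^*$, then we say that the dual Galerkin approximation converges. Similarly we note the following dual uniform Banach-Nečas-Babuška condition
\[ \text{there exists } \beta^*>0 \text{ such that } \sup_{u\in U_n,\ \|u\|_U\le 1}|a(u,v)|\ge\beta^*\|v\|_V \text{ for all } v\in V_n \text{ and all } n\in\mathbb{N}. \tag{BNB$^*$} \]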
Then the following theorem holds.
Theorem 2.4.
The following assertions are equivalent:
- (i) the Galerkin approximation converges;
- (ii) (BNB) holds;
- (iii) (BNB$^*$) holds;
- (iv) the dual Galerkin approximation converges.
It is surprising that (BNB) and (BNB$^*$) are equivalent even though the corresponding condition (2.5) is obviously not equivalent to its dual form. In fact, it can well happen that $\mathcal{A}$ is injective and has closed range (so that there exists $\beta>0$ satisfying (2.5)) but the range of $\mathcal{A}$ is a proper subspace of $V'$ so that there exists $v\in V$ such that $v\neq 0$ and $a(u,v)=0$ for all $u\in U$; in particular the dual form of (2.5) does not hold for any $\beta^*>0$.
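A standard instance of this asymmetry, given here as an illustrative assumption and not as an example from the paper, is the right shift $S$ on real $\ell^2$ with $a(u,v)=\langle Su,v\rangle$. The shift is isometric, so (2.5) holds with $\beta=1$, while $a(u,e_1)=0$ for every $u$, so the dual estimate fails. The sketch below works with finitely supported sequences stored as Python lists.

```python
import math

def right_shift(u):
    """S(u_1, u_2, ...) = (0, u_1, u_2, ...) on finitely supported real sequences."""
    return [0.0] + list(u)

def norm(u):
    """l^2 norm of a finitely supported real sequence."""
    return math.sqrt(sum(x * x for x in u))

def form(u, v):
    """a(u, v) = <S u, v>; zip stops at the shorter list, the missing entries are zero."""
    return sum(x * y for x, y in zip(right_shift(u), v))

u = [3.0, -1.0, 2.0]     # arbitrary test vector
e1 = [1.0]               # first unit vector

# Testing against v = S u recovers the full norm: a(u, S u) = ||u||^2.
attained = form(u, right_shift(u))
# The first unit vector is annihilated by the form, whatever u is.
annihilated = form(u, e1)
```

Here `attained` equals $\|u\|^2$, while `annihilated` is zero, so $e_1$ is a non-zero $v$ with $a(u,v)=0$ for the chosen $u$ (and, by the same computation, for every $u$).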
We will give the proof of Theorem 2.4 in several steps which give partly even stronger results. First we show that (BNB) implies (i), where the constant $\gamma$ in (2.7) can even be expressed in terms of $\beta$ and $M$. Although the proof of this result is classical (see for example [31, 12]), we provide it for the convenience of the reader, but also to establish the well-posedness of (2.1) which we did not assume. This will be important for the proof of Theorem 2.4 and for the main result in Section 5.
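Proposition 2.5.
Let $\beta>0$. Assume that for all $n\in\mathbb{N}$,
\[ \sup_{v\in V_n,\ \|v\|_V\le 1}|a(u,v)|\ge\beta\|u\|_U\quad\text{for all } u\in U_n. \tag{2.8} \]
Then the Galerkin-approximation converges and (2.7) holds with
\[ \gamma=1+\frac{M}{\beta}. \]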
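Proof.
Let $L\in V'$. Note that (2.8) implies (2.3). Thus, for each $n\in\mathbb{N}$ there exists a unique solution $u_n$ of (2.2). By (2.8),
\[ \beta\|u_n\|_U\le\sup_{v\in V_n,\ \|v\|_V\le 1}|a(u_n,v)|=\sup_{v\in V_n,\ \|v\|_V\le 1}|\langle L,v\rangle|\le\|L\|_{V'}. \]
Since $U$ is reflexive, we find $u\in U$ such that a subsequence of $(u_n)_{n\in\mathbb{N}}$, say, $(u_{n_k})_{k\in\mathbb{N}}$, converges weakly to $u$. Let $v\in V$. By assumption we find $v_n\in V_n$ such that $v_n\to v$ in $V$. It follows that
\[ a(u,v)=\lim_{k\to\infty}a(u_{n_k},v_{n_k})=\lim_{k\to\infty}\langle L,v_{n_k}\rangle=\langle L,v\rangle. \]
Thus we find a solution $u$ of (2.1). But so far we do not know its uniqueness. This will be a consequence of (2.7) which we prove now. Indeed, observe that
\[ a(u,\chi)=\langle L,\chi\rangle=a(u_n,\chi)\quad\text{for all }\chi\in V_n. \]
It follows that $a(u-u_n,\chi)=0$ for all $\chi\in V_n$ (Galerkin orthogonality). Using this, for all $w\in U_n$,
\[ \|u-u_n\|_U\le\|u-w\|_U+\|w-u_n\|_U\le\|u-w\|_U+\frac{1}{\beta}\sup_{v\in V_n,\ \|v\|_V\le 1}|a(w-u_n,v)|=\|u-w\|_U+\frac{1}{\beta}\sup_{v\in V_n,\ \|v\|_V\le 1}|a(w-u,v)|\le\Big(1+\frac{M}{\beta}\Big)\|u-w\|_U. \]
Taking the infimum over all $w\in U_n$ we obtain (2.7). In particular $u_n\to u$ in $U$, which shows uniqueness. ∎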
The following result is due to Xu and Zikatanov [31, Theorem 2] (see also [2, Satz 9.41]). We nevertheless provide its proof for the sake of completeness.
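Proposition 2.6.
Assume that $U$ is a Hilbert space and that $\beta>0$ is such that (2.8) holds. Then the Galerkin-approximation converges and (2.7) holds with $\gamma=\frac{M}{\beta}$.
Proof.
Note that (2.8) implies (2.3). Consequently for each $u\in U$ there exists a unique $Q_nu\in U_n$ such that
\[ a(Q_nu,\chi)=a(u,\chi)\quad\text{for all }\chi\in V_n. \]
Then $Q_n$ is a projection from $U$ onto $U_n$, which is called the Ritz projection. Moreover,
\[ \beta\|Q_nu\|_U\le\sup_{v\in V_n,\ \|v\|_V\le 1}|a(Q_nu,v)|=\sup_{v\in V_n,\ \|v\|_V\le 1}|a(u,v)|\le M\|u\|_U. \]
Thus $\|Q_n\|\le\frac{M}{\beta}$.
Since $Q_n$ is a projection with $Q_n\neq 0$ and $Q_n\neq I$, it follows from a result due to Kato [17, Lemma 4] that $\|I-Q_n\|=\|Q_n\|$.
Now let $L\in V'$, $u$ the solution of (2.1) and $u_n$ the solution of (2.2). Then for any $\chi\in U_n$,
\[ u-u_n=(I-Q_n)u=(I-Q_n)(u-\chi). \]
Hence
\[ \|u-u_n\|_U\le\|I-Q_n\|\,\|u-\chi\|_U\le\frac{M}{\beta}\|u-\chi\|_U. \]
This implies that
\[ \|u-u_n\|_U\le\frac{M}{\beta}\operatorname{dist}(u,U_n). \]
∎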
Remark 2.7.
Also in certain Banach spaces an improvement of the constant $\gamma$ is possible, see Stern [29].
Next we show that even a weaker assumption than the convergence of the Galerkin-approximation implies (BNB$^*$).
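Proposition 2.8.
Assume (2.3) for all $n\in\mathbb{N}$ and that
\[ \sup_{n\in\mathbb{N}}\|u_n\|_U<\infty \]
whenever $L\in V'$ and $u_n$ is the solution of (2.2). Then (BNB$^*$) holds.
Proof.
Since the spaces $U_n$ and $V_n$ have the same finite dimension, our assumption (2.3) implies also dual uniqueness, i.e. $a(u,v)=0$ for all $u\in U_n$ implies $v=0$ whenever $v\in V_n$, and this for all $n\in\mathbb{N}$. Thus
\[ \|v\|_{V_n}:=\sup_{u\in U_n,\ \|u\|_U\le 1}|a(u,v)| \]
defines a norm on $V_n$. Moreover,
\[ \|v\|_{V_n}\le M\|v\|_V\quad\text{for all } v\in V_n. \]
We show that the set
\[ B:=\{v\in V_n : n\in\mathbb{N},\ \|v\|_{V_n}\le 1\} \]
is bounded. For that purpose, let $L\in V'$. By assumption there exist $u_n\in U_n$ and $c\ge 0$ such that
\[ \|u_n\|_U\le c\quad\text{for all } n\in\mathbb{N} \]
and $a(u_n,\chi)=\langle L,\chi\rangle$ for all $\chi\in V_n$. Now, for $v\in B$ with $v\in V_n$,
\[ |\langle L,v\rangle|=|a(u_n,v)|\le\|u_n\|_U\,\|v\|_{V_n}\le c. \]
This shows that $B$ is weakly bounded and thus, owing to the Banach-Steinhaus theorem, norm-bounded. Therefore there exists $\beta^*>0$ such that $\|v\|_V\le\frac{1}{\beta^*}$ for all $v\in B$, i.e.
\[ \sup_{u\in U_n,\ \|u\|_U\le 1}|a(u,v)|\ge\beta^*\|v\|_V\quad\text{for all } v\in V_n,\ n\in\mathbb{N}. \]
This is (BNB$^*$). ∎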
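Proof of Theorem 2.4.
(ii)$\Rightarrow$(i) and (iii)$\Rightarrow$(iv) hold via Proposition 2.5, whereas (i)$\Rightarrow$(iii) and (iv)$\Rightarrow$(ii) follow from Proposition 2.8.
∎
Remark: The hypothesis on $U$ and $V$ to be reflexive is not needed in Proposition 2.5.
Finally we mention that the best lower bounds for $\beta$ in (BNB) and for $\beta^*$ in (BNB$^*$) are the same if $U$ and $V$ are Hilbert spaces.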
Proposition 2.9**.**
Assuming that and are Hilbert spaces, let . Then the two conditions and are equivalent:
[TABLE]
[TABLE]
Proof.
Let and be given by
[TABLE]
Then
[TABLE]
where is the adjoint of . Moreover, since is invertible,
[TABLE]
for all if and only if . Since , it follows that and hence
[TABLE]
∎
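In finite dimensions the statement that the best constants for a form and for its adjoint coincide reduces to a fact about matrices: a square matrix and its transpose have the same smallest singular value. The following sketch checks this for a real form $a(u,v)=v^{T}Bu$ on $\mathbb{R}^2$; the matrix `B` is a hypothetical example and the closed-form eigenvalue formula is specific to the 2x2 case.

```python
import math

def smallest_singular_value(b):
    """Smallest singular value of a real 2x2 matrix b given as a list of rows."""
    (p, q), (r, s) = b
    # Entries of the symmetric matrix b^T b.
    g11 = p * p + r * r
    g12 = p * q + r * s
    g22 = q * q + s * s
    # Smallest eigenvalue of [[g11, g12], [g12, g22]] in closed form.
    lam_min = 0.5 * (g11 + g22) - math.sqrt(0.25 * (g11 - g22) ** 2 + g12 ** 2)
    return math.sqrt(max(lam_min, 0.0))

def transpose(b):
    """Transpose of a 2x2 matrix given as a list of rows."""
    return [[b[0][0], b[1][0]], [b[0][1], b[1][1]]]

B = [[2.0, 1.0], [0.0, 1.0]]    # hypothetical matrix of the form a(u, v) = v^T B u

# inf over u of sup over v of |a(u, v)| / (|u| |v|) is the smallest singular value of B,
# and the same quantity for the adjoint form is the smallest singular value of B^T.
beta = smallest_singular_value(B)
beta_star = smallest_singular_value(transpose(B))
```

Both constants agree up to rounding, which is the finite-dimensional shadow of the Hilbert space statement.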
W. V. Petryshyn, namely in Theorems 2 and 3 of [22], considers approximation of an operator equation by finite dimensional problems and characterizes strong convergence. However, except in very special situations, it seems not possible to deduce from this the convergence of a Galerkin approximation formulated in terms of sesquilinear forms. Further results for operator equations and their approximation can be found in the monograph [25, p. 26 ff].
3. Existence of a converging Galerkin approximation
In this section, we again let $U$ and $V$ be separable reflexive real Banach spaces and let $a\colon U\times V\to\mathbb{R}$ be a continuous sesquilinear form such that the problem (2.1) is well-posed; i.e. for all $L\in V'$ there exists a unique $u\in U$ satisfying (2.1). Since $U$ and $V$ are separable, there always exist approximating sequences $(U_n)_{n\in\mathbb{N}}$ of $U$ and $(V_n)_{n\in\mathbb{N}}$ of $V$. Our question is whether there is a choice of these sequences which is adapted to the problem (2.1); i.e. such that the associated Galerkin approximation converges. We will show that the answer is related to the approximation property. In fact, different versions of this property play a role; we recall them in the next definition.
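Definition 3.1 (Approximation property and Schauder decomposition).
Let $X$ be a separable Banach space.
- a) The space $X$ has the approximation property (AP) if, for every compact subset $K$ of $X$ and every $\varepsilon>0$, there exists a finite rank operator $R\in\mathcal{L}(X)$ such that
\[ \|Rx-x\|_X\le\varepsilon\quad\text{for all } x\in K. \]
- b) The space $X$ has the bounded approximation property (BAP) if there exists a sequence $(P_n)_{n\in\mathbb{N}}$ of finite rank operators in $\mathcal{L}(X)$ such that
\[ \lim_{n\to\infty}P_nx=x\quad\text{for all } x\in X. \]
- c) The space $X$ has the bounded projection approximation property (BPAP) if each $P_n$ in b) can be chosen as a projection (i.e. such that $P_n^2=P_n$).
- d) The space $X$ possesses a finite dimensional decomposition if one finds $(P_n)_{n\in\mathbb{N}}$ as in c) with the additional property
\[ P_nP_m=P_mP_n=P_m\quad\text{for all } m\le n. \tag{3.1} \]
- e) The space $X$ has a Schauder basis if d) holds with
\[ \dim\,(P_n-P_{n-1})X=1\quad\text{for all } n\in\mathbb{N},\ \text{where } P_0:=0. \]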
It is known that (BAP) is equivalent to (AP) if $X$ is reflexive. The first counterexample of a Banach space without (AP) was given by Enflo [11]. He constructed a space which is even separable and reflexive.
Obviously the properties a)–e) have decreasing generality. It was Read [26] who showed that (BAP) does not imply (BPAP), even if reflexive and separable spaces are considered. Szarek [30] constructed a reflexive, separable Banach space having a finite dimensional Schauder decomposition but not a Schauder basis. Finally, it seems to be unknown whether (BPAP) implies the existence of a finite dimensional Schauder decomposition (see [24, Sec. 5.7.4.6] and [7, Problem 6.2]). However, if $X$ is reflexive and separable, then these two properties are equivalent by [7, Theorem 6.4 (3)].
Concerning the notion of finite dimensional Schauder decomposition, there is an equivalent formulation, namely the existence of finite dimensional subspaces $X_n$ of $X$ such that for each $x\in X$ there exist unique $x_n\in X_n$ such that $x=\sum_{n=1}^{\infty}x_n$. This explains the name. We refer to [20, Chapter I], [7] for more information and to [24, Sec. 5.7.4] for the history of the approximation property. In the following theorem, by the hypothesis of well-posedness, the two Banach spaces $U$ and $V'$ are isomorphic. For this reason they have the same Banach space properties.
Theorem 3.2.
Let $U$ and $V$ be separable reflexive Banach spaces and let $a\colon U\times V\to\mathbb{K}$ be a continuous sesquilinear form such that (2.1) is well-posed. Then the following assertions are equivalent.
- (i) There exist approximating sequences $(U_n)_{n\in\mathbb{N}}$ of $U$ and $(V_n)_{n\in\mathbb{N}}$ of $V$ such that the associated Galerkin approximation converges.
- (ii) The space $U$ has the (BPAP).
- (iii) The space $U$ has a finite dimensional Schauder decomposition.
Here convergence of the associated Galerkin approximation is understood in the sense of Definition 2.3.
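Proof of Theorem 3.2.
(i)$\Rightarrow$(ii): Let $u\in U$. Then $L:=a(u,\cdot)$ defines an element $L\in V'$. By Definition 2.3, for each $n\in\mathbb{N}$, there exists a unique $P_nu\in U_n$ such that
\[ a(P_nu,\chi)=a(u,\chi)\quad\text{for all }\chi\in V_n. \]
Moreover, $\|u-P_nu\|_U\le\gamma\operatorname{dist}(u,U_n)$ for all $n\in\mathbb{N}$ and some $\gamma>0$. In particular, $\lim_{n\to\infty}P_nu=u$. It follows from the definition that $P_n^2=P_n$. Since $P_nU\subset U_n$, each $P_n$ has finite rank. We have shown that the space $U$ has the (BPAP).
(ii)$\Rightarrow$(iii): See [7, Theorem 6.4 (3)].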
Let be the operator defined by . Then is invertible. By hypothesis there exist finite rank projections such that for all . Let , be the solution of (2.1). Then
[TABLE]
We show that is obtained as a Galerkin approximation. In fact, fix . There exist , such that and
[TABLE]
for all . Since is reflexive there exist such that
[TABLE]
for all and . Define and . Now consider the given . Let . Then
[TABLE]
if and only if
[TABLE]
By (3.4),
[TABLE]
Therefore is the unique solution of (3.5). Again, by (3.4),
[TABLE]
and it follows from (3.2) that . This also implies that as . Thus the sequence is approximating.
It remains to show that the sequence is approximating in . For this we need the additional property (3.1). Consider the adjoint of . Then weakly converges to as for all . Thus
[TABLE]
is weakly dense in . But, because of (3.1), is a subspace of . Thus, by Mazur’s Theorem, is dense in . If , then there exist such that . Thus
[TABLE]
for all by (3.1), and then for all . Since , it follows that for all . This implies that the sequence is approximating in . It follows from (3.4) that . In fact, fix and consider as in (3.3). Then (3.4) says that . Since is an approximating sequence in and is an isomorphism from to , it follows that is an approximating sequence in . ∎
4. Essentially coercive forms
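Let $V$ be a separable Hilbert space over $\mathbb{K}=\mathbb{R}$ or $\mathbb{C}$ and $a\colon V\times V\to\mathbb{K}$ be a sesquilinear form satisfying
\[ |a(u,v)|\le M\|u\|_V\|v\|_V\quad\text{for all } u,v\in V, \]
for some $M>0$. Then we may associate with $a$ the operator $\mathcal{A}\in\mathcal{L}(V,V')$ defined by
\[ \langle\mathcal{A}u,v\rangle=a(u,v)\quad (u,v\in V). \]
If $a$ is coercive, i.e. if
\[ |a(u,u)|\ge\alpha\|u\|_V^2\quad\text{for all } u\in V, \]
for some $\alpha>0$, then $\mathcal{A}$ is invertible. This consequence is the well-known Lax-Milgram lemma.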
Remark 4.1.
The notion of coercivity is not uniform in the literature. Ours is the natural hypothesis of the Lax-Milgram Lemma and conforms to the Wikipedia entry "Babuška-Lax-Milgram theorem". In non-linear analysis there is a wide agreement on this notion: in the real case, a possibly non-linear operator $\mathcal{A}\colon V\to V'$ is called coercive if there exists a function $\varphi\colon[0,\infty)\to\mathbb{R}$ such that $\varphi(r)\to\infty$ as $r\to\infty$ and $\langle\mathcal{A}u,u\rangle\ge\varphi(\|u\|_V)\|u\|_V$ for all $u\in V$. If $\mathcal{A}$ is linear this is equivalent to the existence of $\alpha>0$ such that
\[ \langle\mathcal{A}u,u\rangle\ge\alpha\|u\|_V^2\quad\text{for all } u\in V, \]
i.e. our condition without the absolute value. This is a "forcing condition" which justifies the name coercive. Other authors prefer the word ellipticity, see e.g. [15], [21]. We use elliptic for shifted coercivity in [1], see also the remark at the end of this section.
Our aim is to find weaker assumptions than coercivity which help to decide whether the operator is invertible.
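Note that $a$ is coercive if and only if
\[ u_n\in V,\ \lim_{n\to\infty}a(u_n,u_n)=0\ \Longrightarrow\ \lim_{n\to\infty}\|u_n\|_V=0. \]
We weaken this property in the following way.
Definition 4.2 (Essential coercivity).
The continuous sesquilinear form $a$ (or the operator $\mathcal{A}$) is called essentially coercive if for each sequence $(u_n)_{n\in\mathbb{N}}$ in $V$ weakly converging to $0$ and such that $a(u_n,u_n)\to 0$, one has $\|u_n\|_V\to 0$.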
The following is a characterization of this new property.
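Theorem 4.3.
The following assertions are equivalent:
- (i) the form $a$ is essentially coercive;
- (ii) there exist an orthogonal projection $P$ of finite rank and $\alpha>0$ such that
\[ |a(u,u)|+\|Pu\|_V^2\ge\alpha\|u\|_V^2\quad\text{for all } u\in V; \]
- (iii) there exist a Hilbert space $H$, a compact operator $J\colon V\to H$ and $\alpha>0$ such that
\[ |a(u,u)|+\|Ju\|_H^2\ge\alpha\|u\|_V^2\quad\text{for all } u\in V; \]
- (iv) there exist a compact operator $K\colon V\to V'$ and $\alpha>0$ such that
\[ |a(u,u)|+|\langle Ku,u\rangle|\ge\alpha\|u\|_V^2\quad\text{for all } u\in V. \]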
Proof.
: Let be an orthonormal basis of and consider the orthogonal projections given by
[TABLE]
Assume that (ii) is false for every . Then there exists a sequence such that and
[TABLE]
Note that, since is a self-adjoint operator,
[TABLE]
with for all . This implies that converges weakly to [math]. Since , it follows that converges weakly to [math]. Moreover . Therefore is not essentially coercive.
: Choose and .
: There exists a unique operator such that
[TABLE]
for all . Choose .
: Let that tends weakly to [math] and such that tends to [math] as . Since is compact, as . Hence as . By assumption there exists such that
[TABLE]
It follows that as . ∎
Next we want to justify the notion ”essentially coercive”. We recall that by the Toeplitz–Hausdorff theorem [14], the numerical range of ,
[TABLE]
is a convex set. Hence also is convex. For ,
[TABLE]
if and only if
[TABLE]
where in the real case and if . This observation leads to the following more precise description of coercivity.
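Lemma 4.4.
The form $a$ is coercive if and only if there exist $\alpha>0$ and $\lambda\in\mathbb{K}$ with $|\lambda|=1$ such that
\[ \operatorname{Re}\big(\lambda\,a(u,u)\big)\ge\alpha\|u\|_V^2\quad\text{for all } u\in V. \]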
Proof.
We give the proof for . Assume that is coercive. There exists a maximal such that . Then there exists of modulus ; i.e. for some . The set is convex and closed. Moreover and . This implies that for all . Indeed, let such that . Then the segment has a non-empty intersection with . Since is convex it follows that .
Conversely, clearly, if there exists such that for all , then is coercive. ∎
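Theorem 4.5.
Let $\mathcal{A}\in\mathcal{L}(V,V')$. The following assertions are equivalent:
- (i) the operator $\mathcal{A}$ is essentially coercive;
- (ii) there exists a finite rank operator $K\in\mathcal{L}(V,V')$ such that $\mathcal{A}+K$ is coercive;
- (iii) there exists a compact operator $K\in\mathcal{L}(V,V')$ such that $\mathcal{A}+K$ is coercive.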
Proof.
: Choose the orthogonal finite rank projection on and as in Theorem 4.3 (ii). Let and . Then and for all . Let be the Riesz isomorphism given by
[TABLE]
Let . Then for all . Moreover has a matrix decomposition
[TABLE]
according to the decomposition of . Since is orthogonal, is coercive. Thus, by Lemma 4.4, there exists such that and
[TABLE]
for all . Since , there exists a finite rank operator such that
[TABLE]
Choose a further finite rank perturbation such that
[TABLE]
Since is orthogonal, for , we get
[TABLE]
Hence
[TABLE]
Now let . Then is coercive.
(ii)$\Rightarrow$(iii) is obvious.
(iii)$\Rightarrow$(i): Condition (iii) clearly implies Condition (iv) of Theorem 4.3; thus the claim follows from that theorem. ∎
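To make Theorem 4.5 concrete, consider the following model form, which is an assumption of this sketch and not an example from the paper: $a(u,v)=\int_0^1u'v'-\kappa^2\int_0^1uv$ on $V=H^1_0(0,1)$. In the basis $e_j(x)=\sqrt{2}\sin(j\pi x)/(j\pi)$, orthonormal for the inner product $\int_0^1u'v'$, the form is diagonal with $a(e_j,e_j)=1-\kappa^2/(j\pi)^2$. Only finitely many entries are negative, so a finite rank correction on those modes yields a coercive form, while $a$ itself is not coercive when both signs occur. The value `KAPPA` and the function names are hypothetical.

```python
import math

KAPPA = 5.0  # hypothetical wave number; kappa^2 is not of the form (j*pi)^2

def diagonal_entry(j, kappa=KAPPA):
    """a(e_j, e_j) = 1 - kappa^2 / (j*pi)^2 in the H^1_0-orthonormal sine basis."""
    return 1.0 - kappa ** 2 / (j * math.pi) ** 2

def inf_sup_constant(n, kappa=KAPPA):
    """Discrete inf-sup constant on span{e_1, ..., e_n}.

    The form is diagonal in an orthonormal basis, so the constant is the
    smallest absolute value of the diagonal entries.
    """
    return min(abs(diagonal_entry(j, kappa)) for j in range(1, n + 1))

def negative_modes(kappa=KAPPA, jmax=1000):
    """Indices j <= jmax with a(e_j, e_j) < 0; correcting them restores coercivity."""
    return [j for j in range(1, jmax + 1) if diagonal_entry(j, kappa) < 0]
```

For this particular sequence of subspaces the constant `inf_sup_constant(n)` stops changing from $n=2$ on, so (BNB) holds with $\beta=1-\kappa^2/(2\pi)^2$. The sketch only covers the sine subspaces; it does not test other approximating sequences.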
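Corollary 4.6.
Let $a\colon V\times V\to\mathbb{K}$ be a continuous essentially coercive sesquilinear form. The following assertions are equivalent:
- (i) for all $L\in V'$ there exists a unique $u\in V$ such that
\[ a(u,v)=\langle L,v\rangle\quad\text{for all } v\in V; \]
- (ii) $a(u,v)=0$ for all $v\in V$ implies that $u=0$ (uniqueness);
- (iii) for all $L\in V'$ there exists $u\in V$ such that $a(u,v)=\langle L,v\rangle$ for all $v\in V$ (existence).
Proof.
The assertion (i) means that $\mathcal{A}$ is invertible, the assertion (ii) means that $\mathcal{A}$ is injective and the assertion (iii) means that $\mathcal{A}$ is surjective. By Theorem 4.5, there exists a compact operator $K$ such that $\mathcal{A}+K$ is invertible.
(ii)$\Rightarrow$(i): Assume that $\mathcal{A}$ is injective. Write
\[ \mathcal{A}=(\mathcal{A}+K)\big(I-(\mathcal{A}+K)^{-1}K\big). \]
Then also $I-(\mathcal{A}+K)^{-1}K$ is injective. Since $(\mathcal{A}+K)^{-1}K$ is compact, it follows from the classical Fredholm alternative that $I-(\mathcal{A}+K)^{-1}K$ is invertible. Consequently also $\mathcal{A}$ is invertible.
(iii)$\Rightarrow$(i): If $\mathcal{A}$ is surjective, write $\mathcal{A}=\big(I-K(\mathcal{A}+K)^{-1}\big)(\mathcal{A}+K)$ to conclude that $I-K(\mathcal{A}+K)^{-1}$ is surjective. Again we deduce that $I-K(\mathcal{A}+K)^{-1}$ is invertible and so is $\mathcal{A}$. ∎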
Remark 4.7.
In the previous corollary we deduced the Fredholm alternative from Theorem 4.5. This conclusion is well known if a compact perturbation is given, see for example [32, Theorem 22.D] or [15, Lemma 6.108]. Our point is that a priori it is not at all clear that the topological condition defining essential coercivity implies that the form is a compact perturbation of a coercive form. This is what Theorem 4.5 shows. Note that, in [23, p. 229], our notion of essential coercivity is attributed, under the name "condition (S)", to Felix Browder [6] if we identify the operator with a form.
Moreover, we deduce from Theorem 4.5 the following properties of essential coercivity.
Corollary 4.8.
- (a) The set of all essentially coercive operators on $V$ is open in $\mathcal{L}(V,V')$.
- (b) If $\mathcal{A}$ is essentially coercive and $K\in\mathcal{L}(V,V')$ is compact, then $\mathcal{A}+K$ is essentially coercive.
- (c) If $\mathcal{A}$ is essentially coercive, then $\mathcal{A}$ is a Fredholm operator of index $0$.
The following example shows that the invertibility of $\mathcal{A}$ does not imply the essential coercivity of $a$.
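Example 4.9.
Let $V=\ell^2$, and
\[ a(u,v)=\sum_{k=1}^{\infty}(-1)^k\,u_k\overline{v_k}. \]
Let $J\colon V\to V'$ be the Riesz isomorphism introduced in the proof of Theorem 4.5. Then $J^{-1}\mathcal{A}$ is a diagonal operator with merely $1$ and $-1$ in the diagonal. Thus $J^{-1}\mathcal{A}$ and hence $\mathcal{A}$ are invertible. Let $u_n=\frac{1}{\sqrt{2}}(0,\dots,0,1,1,0,\dots)$, where the $1$ is a coordinate for $2n$ and $2n+1$. Then $\|u_n\|=1$ and $u_n$ tends weakly to $0$ as $n\to\infty$. Moreover $a(u_n,u_n)=0$ for all $n\in\mathbb{N}$, which shows that $a$ is not essentially coercive.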
Remark 4.10**.**
Let . In [1] a continuous sesquilinear form is called compactly elliptic if there exists a compact operator , where is some Hilbert space and there exists such that
[TABLE]
In view of Theorem 4.3, each compactly elliptic form is essentially coercive. In fact the following holds: the form is essentially coercive if and only if there exists such that is compactly elliptic.
Proof.
If is compactly elliptic, then is essentially coercive and hence also is essentially coercive. Conversely, let be essentially coercive. By Theorem 4.5, there exists a compact operator such that the form defined by
[TABLE]
is coercive. By Lemma 4.4 there exist of modulus one and such that for all . Now let be the Riesz isomorphism. Then is compact. Choosing we see that is compactly elliptic. It follows from [1, Proposition 4.4 (b)] that is compactly elliptic. ∎
5. Characterization of the universal Galerkin property
In this section we want to characterize those forms on a Hilbert space for which every Galerkin approximation converges, whatever be the choice of the approximating sequence.
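Let $V$ be a separable, infinite dimensional Hilbert space over $\mathbb{K}=\mathbb{R}$ or $\mathbb{C}$, and let $a\colon V\times V\to\mathbb{K}$ be a continuous sesquilinear form. Given $L\in V'$ we again consider solutions of the problem:
\[ \text{find } u\in V \text{ such that } a(u,v)=\langle L,v\rangle \text{ for all } v\in V. \tag{5.1} \]
We say that the form $a$ satisfies uniqueness if for $u\in V$,
\[ a(u,v)=0 \text{ for all } v\in V\ \Longrightarrow\ u=0. \]
We say that (5.1) is well-posed if for all $L\in V'$ there exists a unique solution $u\in V$.
Definition 5.1 (Universal Galerkin property).
The sesquilinear and continuous form $a$ has the universal Galerkin property if (5.1) is well-posed and the following holds. Let $(V_n)_{n\in\mathbb{N}}$ be an arbitrary approximating sequence of $V$. Then there exist $n_0\in\mathbb{N}$ and $\gamma>0$ such that for each $L\in V'$ and each $n\ge n_0$, there exists a unique $u_n\in V_n$ solving
\[ a(u_n,\chi)=\langle L,\chi\rangle\quad\text{for all }\chi\in V_n \]
and
\[ \|u-u_n\|_V\le\gamma\operatorname{dist}(u,V_n), \]
where $u$ is the solution of (5.1).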
As recalled in the introduction and in the preceding section, the Lax-Milgram Theorem and Céa’s Lemma imply the universal Galerkin property if is coercive. We now show that the weaker notion of essential coercivity also provides a sufficient condition for ensuring the universal Galerkin property, and moreover that it is necessary.
Theorem 5.2.
The following assertions are equivalent.
- (i) The form $a$ is essentially coercive and satisfies uniqueness.
- (ii) The form $a$ has the universal Galerkin property.
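Proof.
(i)$\Rightarrow$(ii): let $(V_n)_{n\in\mathbb{N}}$ be an approximating sequence in $V$. By Theorem 2.4 it suffices to show that there exist $n_0\in\mathbb{N}$ and $\beta>0$ such that
\[ \sup_{v\in V_n,\ \|v\|_V\le 1}|a(u,v)|\ge\beta\|u\|_V\quad\text{for all } u\in V_n,\ n\ge n_0. \tag{5.2} \]
Assume that (5.2) is false. We then find a subsequence $(n_k)_{k\in\mathbb{N}}$ and $u_{n_k}\in V_{n_k}$ such that $\|u_{n_k}\|_V=1$ and
\[ \sup_{v\in V_{n_k},\ \|v\|_V\le 1}|a(u_{n_k},v)|\le\frac{1}{k}. \]
We may assume that $(u_{n_k})_{k\in\mathbb{N}}$ converges weakly to some $u\in V$, taking a further subsequence otherwise. Let $v\in V$. Then there exist $v_k\in V_{n_k}$ such that $v_k\to v$. Thus
\[ a(u,v)=\lim_{k\to\infty}a(u_{n_k},v_k)=0. \]
It follows from the uniqueness assumption that $u=0$. Thus $(u_{n_k})_{k\in\mathbb{N}}$ converges weakly to $0$, $\|u_{n_k}\|_V=1$, but $|a(u_{n_k},u_{n_k})|\le\frac{1}{k}$ for all $k\in\mathbb{N}$. Therefore the form $a$ is not essentially coercive.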
(ii) ⇒ (i): the uniqueness condition is part of (ii). It remains to show that is essentially coercive. Let be an orthonormal basis of and . By our assumption, there exist and for all an operator such that
[TABLE]
Denote by the orthogonal projection. Define the operator
[TABLE]
by
[TABLE]
Now assume that is not essentially coercive. Then it follows from Theorem 4.3 that for all we find such that and
[TABLE]
In particular . This implies that . Let . Then and are both approximating sequences. Let and let be arbitrary with unit norm. There exist a unique and such that
[TABLE]
where . Thus
[TABLE]
Consequently and, since , it follows that
[TABLE]
which implies that , i.e. .
Observe that the definition of implies that . Hence
[TABLE]
Consequently
[TABLE]
Thus is violated for the approximating sequence . But then does not hold by Theorem 2.4, which shows that the assumption that is not essentially coercive is false.
∎
It is obvious that a form is essentially coercive if and only if its adjoint is essentially coercive. However, a surprising consequence of Theorem 5.2 is that, for an essentially coercive form, uniqueness for the form and uniqueness for its adjoint are equivalent, as the following corollary shows.
**Corollary 5.3.**
Let be a separable Hilbert space over and be a continuous, essentially coercive form. The following assertions are equivalent:
- (i) for all , for all implies ;
- (ii) for all , for all implies ;
- (iii) for all there exists in such that , for all ;
- (iv) for all there exists in such that , for all .
Proof.
(i) ⇔ (ii): this follows from Theorem 5.2 and Theorem 2.4. The other equivalences follow from Corollary 4.6. ∎
6. The Aubin-Nitsche trick revisited
In this section we want to prove that on suitable Hilbert spaces containing the space continuously the approximation speed in the Galerkin approximation can be improved. We refer also to [28] for related, but different results in this direction.
Let be a separable Hilbert space over or , and a sesquilinear form satisfying
[TABLE]
Let be an approximating sequence of . We assume that (BNB) holds; i.e. there exists such that
[TABLE]
Given and , let be the solution of
[TABLE]
and the solution of
[TABLE]
Note that, by subtracting (6.3) and (6.2), we obtain the following Galerkin orthogonality:
[TABLE]
We know from Proposition 2.5 and Proposition 2.6 that
[TABLE]
for all . We want to improve this estimate if the given data is in a suitable subspace of .
Let ; i.e. is a Banach space such that and
[TABLE]
for all and some . We define for
[TABLE]
where the distance is taken in . Thus
[TABLE]
where is the isomorphism given by
[TABLE]
Thus, if is the solution of (6.3) and the approximate solution of (6.2), then, if , we have the estimate
[TABLE]
which has the advantage of being uniform for in the unit ball of .
**Remark 6.1.**
Let be the solution operator for (6.2). Then (6.8) says that
[TABLE]
We can characterize when as .
**Proposition 6.2.**
One has
[TABLE]
Proof.
Denote by the orthogonal projection onto . Then
[TABLE]
where is the canonical injection. If is compact, then , where is the unit ball of , is relatively compact in . Now, converges strongly to the identity of . Since , this convergence is uniform on compact subsets of . This shows that as .
Conversely, if , then is compact as limit of finite rank operators. Then also is compact. ∎
Similarly, we define
[TABLE]
where is given by
[TABLE]
As before we have defined as but with replaced by the adjoint form of . Thus we have for all ,
[TABLE]
Now we apply the Aubin–Nitsche trick in the following proof. In contrast to the literature [12] we allow non-selfadjoint forms and also let where is arbitrary. However, as usual, we fix a Hilbert space such that with dense range. Thus we have the Gelfand triple
[TABLE]
Now we let be another Banach space in which we choose the given data , whereas our error estimate is done with respect to the norm of .
**Theorem 6.3.**
Let and let be the solution of (6.3), the solution of (6.2). Then
[TABLE]
for all .
Proof.
Let . Then, following in the footsteps of Aubin–Nitsche, we consider the solution of
[TABLE]
Then, by (6.11), for any ,
[TABLE]
where in the last identity we used the Galerkin orthogonality (6.4).
Since is arbitrary, this implies that
[TABLE]
Now we use (6.8) and (6.9) to deduce
[TABLE]
Consequently, we obtain
[TABLE]
∎
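For readers who want the skeleton of the duality argument in explicit form, the classical Aubin–Nitsche step can be sketched as follows. This is the textbook version and not a reconstruction of the displayed formulas of the proof above: the symbols $g$, $w_g$, $\chi$, the continuity constant $M$ and the approximating spaces $V_n$ are notation assumed here, and the dual problem is assumed to be well-posed.

```latex
% Assumed setting: V \hookrightarrow H \hookrightarrow V', |a(u,v)| \le M \|u\|_V \|v\|_V,
% u the exact solution, u_n \in V_n the Galerkin solution,
% so that a(u-u_n, \chi) = 0 for all \chi \in V_n (Galerkin orthogonality).
% For g \in H let w_g \in V solve the dual problem
%   a(v, w_g) = (v \mid g)_H   for all v \in V.
\begin{align*}
  (u-u_n \mid g)_H
     &= a(u-u_n,\, w_g)
      = a(u-u_n,\, w_g - \chi)
      && \text{for every } \chi \in V_n, \\
  |(u-u_n \mid g)_H|
     &\le M \,\|u-u_n\|_V \,\operatorname{dist}_V(w_g, V_n), \\
  \|u-u_n\|_H
     &= \sup_{\|g\|_H \le 1} |(u-u_n \mid g)_H|
      \le M \,\|u-u_n\|_V \,\sup_{\|g\|_H \le 1} \operatorname{dist}_V(w_g, V_n).
\end{align*}
```

The error in the weaker norm is therefore the error in $V$ multiplied by a factor measuring how well solutions of the dual problem are approximated by $V_n$, uniformly over the unit ball of the data.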
7. Applications
7.1. Selfadjoint positive operators with compact resolvent
As an illustration, we apply Theorem 6.3 to selfadjoint positive operators with compact resolvent. Let be infinite dimensional, separable Hilbert spaces over or such that is compactly injected in and dense in . Thus we have the Gelfand triple
[TABLE]
Let be continuous, symmetric and coercive. Then the operator given by
[TABLE]
is invertible. Moreover, there exist an orthonormal basis of and such that
[TABLE]
and
[TABLE]
(see e.g. [2, Satz 4.49]) and
[TABLE]
Passing to an equivalent scalar product we may and will assume that
[TABLE]
Thus and ; i.e. we have in the above estimates.
Consider . Then is an approximating sequence of . We define for
[TABLE]
which is a Hilbert space for the norm
[TABLE]
Then it is easy to see that , , with identity of the norms. Moreover, for ,
[TABLE]
(the complex interpolation space) and for ,
[TABLE]
**Lemma 7.1.**
One has for ,
[TABLE]
In particular,
[TABLE]
Proof.
Let . Then is an orthonormal basis of . For ,
[TABLE]
Thus
[TABLE]
defines the orthogonal projection of onto . Moreover, in one has
[TABLE]
Let , . Then
[TABLE]
Thus
[TABLE]
since is increasing.
Taking , one sees that . ∎
Now let , where and let . Let such that
[TABLE]
i.e. is the approximate solution. Then by Theorem 6.3
[TABLE]
Thus we obtain the following error estimate
[TABLE]
**Remark 7.2.**
In this special case one can compute the error directly. In fact and . Thus
[TABLE]
which is exactly the estimate (7.1). This means that Theorem 6.3 gives the best possible estimate of the error.
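The direct computation mentioned in this remark can be illustrated numerically in coefficient space. The following is a minimal sketch under stated assumptions, not the paper's own example: the eigenvalues are taken to be $\lambda_k = k^2$, the data has coefficients $f_k = 1/k$, the series are truncated at `K` modes, and the function names are hypothetical. The bound checked is the elementary one that follows from monotonicity of the eigenvalues, $\|u - u_n\|_H \le \lambda_{n+1}^{-1}\|f\|_H$.

```python
import math


def galerkin_coefficients(f_coeffs, eigenvalues, n):
    """Solve the n x n Galerkin system in the eigenbasis e_1, ..., e_n.

    In this basis the Galerkin matrix is diag(lambda_1, ..., lambda_n),
    so the k-th coefficient of u_n is simply f_k / lambda_k.
    """
    return [f_coeffs[k] / eigenvalues[k] for k in range(n)]


def h_error(f_coeffs, eigenvalues, n):
    """H-norm of u - u_n: only the modes k > n of the exact solution remain."""
    tail = sum((f_coeffs[k] / eigenvalues[k]) ** 2
               for k in range(n, len(f_coeffs)))
    return math.sqrt(tail)


# Assumed model data (hypothetical, chosen only for illustration).
K = 2000                                                 # truncation of the series
eigenvalues = [float(k * k) for k in range(1, K + 1)]    # lambda_k = k^2
f_coeffs = [1.0 / k for k in range(1, K + 1)]            # square summable, so f is in H
f_norm = math.sqrt(sum(c * c for c in f_coeffs))

for n in (4, 8, 16, 32):
    err = h_error(f_coeffs, eigenvalues, n)
    # eigenvalues[n] is lambda_{n+1} because the list is 0-based.
    bound = f_norm / eigenvalues[n]
    print(f"n={n:3d}  error={err:.3e}  bound={bound:.3e}")
```

In this diagonal setting the Galerkin solution is just the truncated eigenfunction expansion of the exact solution, so the error is the tail of the series and can be compared with the bound term by term.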
Let us give an example of an application of (7.1). Let , with norm . Let with norm
[TABLE]
Then the injection is compact. Let be given by
[TABLE]
Let . Then there exists a unique such that
[TABLE]
In fact, is the unique element of such that for all .
For , let be the -th Fourier coefficient. Then
[TABLE]
Let . Then is an orthonormal basis of and . Let and let be the approximate solution, i.e.
[TABLE]
Then our estimate shows that
[TABLE]
Let and . If , then by (7.1),
[TABLE]
7.2. Finite elements for the Poisson problem
In this section we want to apply our results to show the convergence of a numerical approximation via triangulation for the solution of a Poisson problem where coercivity is violated but essential coercivity holds. For simplicity we choose throughout this section. Let be an open, bounded, convex set and let () be Lipschitz continuous functions such that
[TABLE]
for all , where . Moreover, let for and . We consider the operator given by
[TABLE]
Note that is linear and continuous.
Our aim is to study the Poisson equation
[TABLE]
where is given and a solution is to be determined and calculated by approximation. We will impose the uniqueness condition
[TABLE]
We use the continuous, coercive form
[TABLE]
given by
[TABLE]
and also the perturbed form given by
[TABLE]
Note that the adjoint form defined by has the same form as . This is the reason why we also consider the coefficients .
Then the following well-posedness result holds.
**Theorem 7.3.**
- i) The form is essentially coercive.
- ii) Assume (7.4). Then for each there exists a unique solution of (7.3).
Proof.
a) We first show $H^2$-regularity. Let , such that for all . Then and . In fact, let
[TABLE]
Then and for all . Now it follows from the classical $H^2$-result of Kadlec [16] (see [13, Theorem 3.2.1.2]) that . It clearly follows that .
b) We show that is essentially coercive. Let as in and as . Then as in . Since the embedding of in is compact, it follows that in . Consequently
[TABLE]
Thus also as . Since is coercive this implies as .
c) The form satisfies uniqueness. In fact, let such that for all . Then by part a) of the proof. Hence by our assumption (7.4).
d) Let . It follows from Corollary 4.6 that there exists a unique such that for all . Now a) implies that and .
∎
Concerning the uniqueness property, we make the following remark.
**Remark 7.4 (Eigenvalues and uniqueness).**
Replace the operator by (i.e. by ) where . Then there exists a finite or countably infinite set such that
[TABLE]
where and , if .
If and , then for all and then we are in the coercive case. But in general there will also be negative eigenvalues. The uniqueness condition (7.4) for is equivalent to saying that for all .
Our final aim is to show that the finite element method yields an approximation of the solution of (7.3).
For that purpose we assume that and that is a convex polygon. Let be a quasi-uniform admissible triangulation of (see [2, Definition 9.26]). In particular each consists of finitely many triangles covering of outer radius .
For , we consider the corresponding finite element space (see [2, Equation (9.35)]). Thus consists of those continuous functions on which vanish at and are affine on each triangle .
The following fundamental estimates are classical (see e.g. [2, Korollar 9.28]).
**Proposition 7.5.**
There exists a constant such that for all and for each ,
[TABLE]
where .
Note that Proposition 7.5 shows how we can approximate functions in by finite elements and so far there is no relation with the solutions of the Poisson equation.
We assume the uniqueness condition (7.4). Then by Theorem 5.2, since the form is essentially coercive, there exists such that for and
[TABLE]
Let . Since is finite dimensional, it follows from (7.6) that for all , there exists a unique such that
[TABLE]
The finite elements are the approximation of the solution of (7.3) we are interested in. They converge in with convergence order and in with convergence order . More precisely, the following is our main theorem of this section.
**Theorem 7.6.**
Let and consider the approximate solutions , . Then there exist and constants independent of such that
[TABLE]
and
[TABLE]
where is the solution of (7.3).
Proof.
Applying the closed graph theorem in the situation of Theorem 7.3, we find a constant such that
[TABLE]
whenever and solves (7.3).
By Theorem 5.2, there exist , , both independent of , such that
[TABLE]
for all . Thus (7.5) implies that for ,
[TABLE]
Now (7.8) follows from (7.10).
Next we establish the -estimate (7.9). For that we compute using (7.5),
[TABLE]
Since , it follows from (7.10) that for all .
The same estimate is true for . Now assume that (7.9) is false. Then there exists a sequence as such that (7.9) does not hold for all and any constant . This contradicts Theorem 6.3. ∎
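The two convergence orders can be observed in a small experiment. The following is a one-dimensional sketch under stated assumptions and not the two-dimensional setting of this section. The model problem $-u'' - \kappa u = f$ on $(0,1)$ with homogeneous Dirichlet conditions and $\kappa = 20$ is an assumption, chosen so that the form is not coercive while uniqueness holds, since $20$ lies strictly between the eigenvalues $\pi^2$ and $4\pi^2$. The exact solution $\sin(\pi x)$, the uniform mesh and all function names are likewise hypothetical choices.

```python
import math

KAPPA = 20.0  # assumed zero-order coefficient; not of the form (k*pi)^2
GAUSS = [(-math.sqrt(0.6), 5.0 / 9.0), (0.0, 8.0 / 9.0), (math.sqrt(0.6), 5.0 / 9.0)]


def exact(x):
    return math.sin(math.pi * x)


def exact_dx(x):
    return math.pi * math.cos(math.pi * x)


def rhs(x):
    # f = -u'' - KAPPA * u for u = sin(pi x)
    return (math.pi ** 2 - KAPPA) * math.sin(math.pi * x)


def solve_dense(A, b):
    """Gaussian elimination with partial pivoting (the matrix is indefinite)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / M[r][r]
    return x


def p1_galerkin(n):
    """P1 Galerkin solution on a uniform mesh with n elements; returns (h, nodal values)."""
    h = 1.0 / n
    m = n - 1  # number of interior nodes
    A = [[0.0] * m for _ in range(m)]
    F = [0.0] * m
    for i in range(m):
        # stiffness entries 2/h, -1/h; mass entries 4h/6, h/6
        A[i][i] = 2.0 / h - KAPPA * 4.0 * h / 6.0
        if i + 1 < m:
            A[i][i + 1] = A[i + 1][i] = -1.0 / h - KAPPA * h / 6.0
    for e in range(n):  # element [x_e, x_{e+1}]
        for t, w in GAUSS:
            x = e * h + 0.5 * h * (t + 1.0)
            wq = 0.5 * h * w
            if e >= 1:          # left node of the element is interior
                F[e - 1] += wq * rhs(x) * (1.0 - t) / 2.0
            if e <= n - 2:      # right node of the element is interior
                F[e] += wq * rhs(x) * (1.0 + t) / 2.0
    return h, [0.0] + solve_dense(A, F) + [0.0]


def errors(n):
    """Return (L2 error, H1-seminorm error) of the P1 Galerkin solution."""
    h, U = p1_galerkin(n)
    l2 = h1 = 0.0
    for e in range(n):
        slope = (U[e + 1] - U[e]) / h
        for t, w in GAUSS:
            s = 0.5 * h * (t + 1.0)
            x = e * h + s
            wq = 0.5 * h * w
            l2 += wq * (exact(x) - (U[e] + slope * s)) ** 2
            h1 += wq * (exact_dx(x) - slope) ** 2
    return math.sqrt(l2), math.sqrt(h1)


def observed_orders(n):
    """Estimate convergence orders by comparing meshes with n and 2n elements."""
    l2_c, h1_c = errors(n)
    l2_f, h1_f = errors(2 * n)
    return math.log2(l2_c / l2_f), math.log2(h1_c / h1_f)


order_l2, order_h1 = observed_orders(16)
print(f"observed L2 order: {order_l2:.2f}, observed H1 order: {order_h1:.2f}")
```

The dense solver with pivoting is used instead of a tridiagonal recursion because the system matrix is symmetric but indefinite for this choice of $\kappa$, so elimination without pivoting is not guaranteed to be stable.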
**Remark 7.7.**
There are other methods to approximate the solution of a non-coercive advection-diffusion equation such as (7.3). In fact, Le Bris, Legoll and Madiot [19] use the Banach-Nečas-Babuška lemma (instead of essential coercivity as we do) and a special measure to construct an approximation.
The advantage is that no initial mesh has to be considered; on the other hand there seems to be no error estimate as precise as the quadratic convergence obtained in Theorem 7.6, even though numerical examples are given in [19].
Still another approach (based on Fredholm perturbation) is presented by Christensen [9], which also involves the Babuška inf-sup condition.
Finally, let us mention the works by Droniou, Gallouët and Herbin [10], based on finite volume methods, which also have the advantage of providing an approximate solution for this problem on any admissible mesh.
Some of the first results on the Galerkin method in a special non-coercive case are due to Schatz [27] and Schatz–Wang [28].
8. Supplement: saddle point problems
Brezzi’s contribution [4] is a version of (BNB) which implies the convergence of the Galerkin approximation in the case of saddle point problems. Let us consider the case where and are real Hilbert spaces and and are continuous bilinear forms in the sense that there exists with
[TABLE]
and
[TABLE]
Then, given , the continuous saddle point problem consists in finding such that
[TABLE]
**Example 8.1.**
An important example is the Stokes problem (motivating some investigation by Ladyžhenskaya [18]), with , where is the space dimension, (the space of -functions with null average),
[TABLE]
and
[TABLE]
The approximation of the saddle point problem is then generally done by a mixed method [4, 5], letting, for , and be approximating sequences in the spaces and , respectively, in the sense of Definition 2.1, and looking for such that
[TABLE]
We call this the approximate saddle point problem.
The following result shows that conditions (8.1) (which are Brezzi’s conditions [4, Hypotheses H1 and H2]) are sufficient for the convergence of the solutions of the approximate saddle point problems. This is proved by Brezzi [4, Theorem 2.1], where a solution is assumed to exist. However, similar to the proof of our Proposition 2.5, one can show that Brezzi’s conditions imply existence and uniqueness of the continuous saddle point problem. Indeed, following the proof of (8.2) given in the proof of [4, Theorem 2.1], letting and , we get a bound on the approximate solution, and a solution of the continuous problem can be obtained by passing to the limit of a weakly converging subsequence. Uniqueness follows from the estimate (8.2) proved by Brezzi. For , define
[TABLE]
and assume that and for all .
**Theorem 8.2 (Brezzi).**
Assume that there exists such that
[TABLE]
Then, given , there exists a unique solution of the continuous saddle point problem and for each a unique solution of the approximate saddle point problem. Moreover,
[TABLE]
where the constant depends only on and .
The saddle point problem can be cast in our framework by letting , , and
[TABLE]
Given , define by
[TABLE]
Then is a solution of the continuous saddle point problem if and only if (1.1) is satisfied. Moreover, letting , a vector satisfies (1.2) if and only if is a solution of the approximate saddle point problem. Thus our Theorem 2.4 shows that the convergence property expressed in Brezzi’s Theorem is equivalent to (BNB) for the form and the approximating sequence . We can use this to show the following converse result of Brezzi’s Theorem.
**Theorem 8.3.**
Assume that, given , for each , there is a unique solution of the discrete saddle point problem and that Then Brezzi’s conditions (8.1) hold.
Proof.
We know from Theorem 2.4 and Proposition 2.5 that (BNB) is satisfied for some . We endow the space with the norm $\|u\|_{\mathcal{V}} = \big(\|w\|_{\mathcal{W}}^{2} + \|p\|_{\mathcal{Y}}^{2}\big)^{1/2}$ for $u = (w, p)$ (it is then a Hilbert space as well). Let be given. We then have,
[TABLE]
Let us first choose, for any , , which means that . Let attaining the supremum value in (8.3). We then have, from the definition of in this framework of a saddle point problem,
[TABLE]
which implies that and
[TABLE]
This proves (8.1). , and thus that the operator , defined for all by
[TABLE]
is bijective from to .
Let and let be defined by
[TABLE]
Choose an element attaining the supremum value in (8.3) for this choice of . We then write , with and , which can be written as for some . We have
[TABLE]
Moreover, since and , and
[TABLE]
by definition of and of . Hence
[TABLE]
This implies that , and therefore is such that
[TABLE]
where we take into account that by Pythagoras' theorem. This concludes the proof of (8.1)..
The equivalence between and allows us to obtain the proof of (8.1). (with the same , see Proposition 2.9), following the same path.
∎
In conclusion, Brezzi’s conditions (8.1) are equivalent to the well-posedness of the continuous saddle point problem together with the convergence of the approximate solutions to the solution, and they are also equivalent to (BNB) for the form and the approximating sequence of .
Note that [5, Chapter II, Remark 2.11] provides a comment on the fact that (8.1). is a necessary condition.
Acknowledgments: We are most grateful to Gilles Lancien for a discussion on the approximation property and for pointing out the survey article of Casazza [7] to us. We also thank the anonymous referee for useful and inspiring comments. This research is partly supported by the Bézout Labex, funded by ANR, reference ANR-10-LABX-58.
References
- [1] W. Arendt, A. F. M. ter Elst, J. B. Kennedy, and M. Sauter. The Dirichlet-to-Neumann operator via hidden compactness. J. Funct. Anal., 266(3):1757–1786, 2014.
- [2] W. Arendt and K. Urban. Partielle Differenzialgleichungen. Eine Einführung in analytische und numerische Methoden. Berlin: Springer Spektrum, 2nd edition, 2018.
- [3] I. Babuška. Error-bounds for finite element method. Numer. Math., 16:322–333, 1970/71.
- [4] F. Brezzi. On the existence, uniqueness and approximation of saddle-point problems arising from Lagrangian multipliers. Rev. Française Automat. Informat. Recherche Opérationnelle Sér. Rouge, 8(R-2):129–151, 1974.
- [5] F. Brezzi and M. Fortin. Mixed and hybrid finite element methods, volume 15 of Springer Series in Computational Mathematics. Springer-Verlag, New York, 1991.
- [6] F. E. Browder. Nonlinear operators and nonlinear equations of evolution in Banach spaces. In Nonlinear functional analysis (Proc. Sympos. Pure Math., Vol. XVIII, Part 2, Chicago, Ill., 1968), pages 1–308, 1976.
- [7] P. G. Casazza. Chapter 7 - Approximation properties. In W. Johnson and J. Lindenstrauss, editors, Handbook of the Geometry of Banach Spaces, volume 1, pages 271–316. Elsevier Science B.V., 2001.
- [8] L. Chesnel and P. Ciarlet Jr. $T$-coercivity and continuous Galerkin methods: application to transmission problems with sign changing coefficients. Numer. Math., 124(1):1–29, 2013.
