Local Fourier analysis for mixed finite-element methods for the Stokes equations
Yunhui He, Scott P. MacLachlan

TL;DR
This paper develops a local Fourier analysis for multigrid methods applied to mixed finite-element discretizations of the Stokes equations, providing insights into convergence behavior and optimizing relaxation parameters.
Contribution
It introduces a local Fourier analysis framework for block-structured relaxation schemes in mixed finite-element methods for Stokes equations, including parameter optimization and numerical validation.
Findings
Optimized relaxation parameters improve convergence.
Local Fourier analysis predicts two-grid convergence for stabilized methods.
Numerical experiments validate theoretical convergence factors.
Abstract
In this paper, we develop a local Fourier analysis of multigrid methods based on block-structured relaxation schemes for stable and stabilized mixed finite-element discretizations of the Stokes equations, to analyze their convergence behavior. Three relaxation schemes are considered: distributive, Braess-Sarazin, and Uzawa relaxation. From this analysis, parameters that minimize the local Fourier analysis smoothing factor are proposed for the stabilized methods with distributive and Braess-Sarazin relaxation. Considering the failure of the local Fourier analysis smoothing factor in predicting the true two-grid convergence factor for the stable discretization, we numerically optimize the two-grid convergence predicted by local Fourier analysis in this case. We also compare the efficiency of the presented algorithms with variants using inexact solvers. Finally, some numerical experiments…
| Cycle | ||||||
|---|---|---|---|---|---|---|
| 0.618 | 0.618 | 0.382 | 0.236 | 0.236 | 0.146 | |
| 0.564 | 0.568 | 0.349 | 0.215 | 0.214 | 0.133 | |
| 0.561 | 0.568 | 0.348 | 0.215 | 0.214 | 0.132 |
| Cycle | ||||||
|---|---|---|---|---|---|---|
| 0.338 | 0.338 | 0.115 | 0.078 | 0.078 | 0.061 | |
| 0.324 | 0.324 | 0.112 | 0.074 | 0.075 | 0.074 | |
| 0.324 | 0.324 | 0.112 | 0.075 | 0.075 | 0.073 |
| Cycle | ||||||
|---|---|---|---|---|---|---|
| 0.670 | 0.670 | 0.449 | 0.300 | 0.300 | 0.201 | |
| 0.652 | 0.652 | 0.436 | 0.291 | 0.292 | 0.196 | |
| 0.651 | 0.652 | 0.435 | 0.291 | 0.291 | 0.195 |
| Cycle | ||||||
|---|---|---|---|---|---|---|
| 0.333 | 0.333 | 0.112 | 0.079 | 0.079 | 0.062 | |
| 0.324 | 0.324 | 0.112 | 0.074 | 0.075 | 0.074 | |
| 0.324 | 0.324 | 0.112 | 0.075 | 0.075 | 0.073 |
| Cycle | ||||||
|---|---|---|---|---|---|---|
| 0.333 | 0.333 | 0.111 | 0.079 | 0.079 | 0.062 | |
| 0.324 | 0.323 | 0.112 | 0.075 | 0.075 | 0.058 | |
| 0.323 | 0.323 | 0.112 | 0.075 | 0.075 | 0.058 |
| Sweep | |||||
|---|---|---|---|---|---|
| 1 (Optimized) | 1.2 | 1.1 | 0.7 | 0.679 | 0.679 |
| 1 | 1.0 | 8/9 | 1.0 | 0.669 | 0.735 |
| 2 (Optimized) | 1.1 | 1.0 | 1.0 | 0.366 | 0.366 |
| 2 | 1.0 | 8/9 | 1.0 | 0.461 | 0.461 |
| Cycle | ||||||
|---|---|---|---|---|---|---|
| 0.366 | 0.366 | 0.167 | 0.128 | 0.128 | 0.106 | |
| 0.352 | 0.353 | 0.160 | 0.120 | 0.120 | 0.100 | |
| 0.352 | 0.353 | 0.160 | 0.122 | 0.122 | 0.100 |
| Cycle | ||||||
|---|---|---|---|---|---|---|
| 0.366 | 0.366 | 0.167 | 0.128 | 0.128 | 0.106 | |
| 0.456 | 0.453 | 0.245 | 0.197 | 0.200 | 0.167 | |
| 0.459 | 0.462 | 0.257 | 0.206 | 0.211 | 0.175 |
| Cycle | ||||||
|---|---|---|---|---|---|---|
| 0.333 | 0.333 | 0.111 | 0.079 | 0.079 | 0.062 | |
| 0.368(2) | 0.346(2) | 0.131(2) | 0.075(2) | 0.075(2) | 0.059(1) | |
| 0.343(2) | 0.351(2) | 0.111(2) | 0.075(2) | 0.075(2) | 0.063(1) |
| Cycle | ||||||
|---|---|---|---|---|---|---|
| 0.673 | 0.673 | 0.111 | 0.079 | 0.079 | 0.062 | |
| 0.585 | 0.585 | 0.112 | 0.075 | 0.075 | 0.058 | |
| 0.584 | 0.584 | 0.112 | 0.075 | 0.075 | 0.058 |
| Sweep | |||||
|---|---|---|---|---|---|
| 1 (Optimized) | 1.6 | 0.8 | 1 | 0.714 | 0.714 |
| 1 | 1.2 | 16/15 | 1.0 | 0.718 | 1.027 |
| 2 (Optimized) | 1.2 | 0.9 | 1.2 | 0.494 | 0.445 |
| 2 | 1.2 | 16/15 | 1.0 | 0.431 | 0.549 |
| Cycle | ||||||
|---|---|---|---|---|---|---|
| 0.445 | 0.445 | 0.319 | 0.262 | 0.262 | 0.225 | |
| 0.418 | 0.420 | 0.301 | 0.251 | 0.250 | 0.212 | |
| 0.420 | 0.420 | 0.304 | 0.250 | 0.249 | 0.212 |
| Cycle | ||||||
|---|---|---|---|---|---|---|
| 0.445 | 0.445 | 0.319 | 0.262 | 0.262 | 0.225 | |
| 0.739 | 0.740 | 0.340 | 0.304 | 0.299 | 0.268 | |
| 0.736 | 0.735 | 0.342 | 0.309 | 0.311 | 0.276 |
| Cycle | ||||||
|---|---|---|---|---|---|---|
| 0.673 | 0.673 | 0.111 | 0.079 | 0.079 | 0.062 | |
| 0.680(4) | 0.677(1) | 0.112(3) | 0.075(2) | 0.075(2) | 0.059(1) | |
| 0.659(1) | 0.662(1) | 0.112(3) | 0.075(2) | 0.075(2) | 0.067(1) |
| Cycle | ||||||
|---|---|---|---|---|---|---|
| 4.893 | 4.893 | 0.249 | 0.109 | 0.109 | 0.090 | |
| NAN | NAN | 0.434 | 0.131 | 0.130 | 0.085 | |
| NAN | NAN | 0.437 | 0.130 | 0.130 | 0.085 |
| Cycle | ||||||
|---|---|---|---|---|---|---|
| 4.893 | 4.893 | 0.249 | 0.109 | 0.109 | 0.090 | |
| 491.373 | 492.094 | 0.240 | 0.104 | 0.104 | 0.085 | |
| NAN | NAN | 0.240 | 0.104 | 0.104 | 0.085 |
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Local Fourier analysis for mixed finite-element methods for the Stokes equations
Yunhui He
Scott P. MacLachlan
Department of Mathematics and Statistics, Memorial University of Newfoundland, St. John’s, NL A1C 5S7, Canada
Abstract
In this paper, we develop a local Fourier analysis of multigrid methods based on block-structured relaxation schemes for stable and stabilized mixed finite-element discretizations of the Stokes equations, to analyze their convergence behavior. Three relaxation schemes are considered: distributive, Braess-Sarazin, and Uzawa relaxation. From this analysis, parameters that minimize the local Fourier analysis smoothing factor are proposed for the stabilized methods with distributive and Braess-Sarazin relaxation. Considering the failure of the local Fourier analysis smoothing factor in predicting the true two-grid convergence factor for the stable discretization, we numerically optimize the two-grid convergence predicted by local Fourier analysis in this case. We also compare the efficiency of the presented algorithms with variants using inexact solvers. Finally, some numerical experiments are presented to validate the two-grid and multigrid convergence factors.
keywords:
Monolithic multigrid, Block-structured relaxation, local Fourier analysis, mixed finite-element methods, Stokes Equations
MSC:
65N55, 65F10, 65F08, 76M10
1 Introduction
In recent years, substantial research has been devoted to efficient numerical solution of the Stokes and Navier-Stokes equations, due both to their utility as models of (viscous) fluids and their commonalities with many other physical problems that lead to saddle-point systems (see, for example [1], and many of the other references cited here). In the linear (or linearized) case, solution of the resulting matrix equations is seen to be difficult, due to indefiniteness and the usual ill-conditioning of discretized PDEs. In the literature, block preconditioners (cf. [1] and the references therein) are widely used, due to their easy construction from standard multigrid algorithms for scalar elliptic PDEs, such as algebraic multigrid [2]. However, monolithic multigrid approaches [3, 4, 5, 6, 7] have been shown to outperform these preconditioners when algorithmic parameters are properly chosen [8, 9]. The focus of this work is on the analysis of such monolithic multigrid methods in the case of stable and stabilized finite-element discretizations of the Stokes equations.
Local Fourier analysis (LFA) [10, 11] has been widely used to predict the convergence behavior of multigrid methods, to help design relaxation schemes and choose algorithmic parameters. In general, the LFA smoothing factor provides a sharp prediction of actual multigrid convergence, see [10], under the assumption of an “ideal” coarse-grid correction scheme (CGC) that annihilates low-frequency error components and leaves high-frequency components unchanged. In practice, the LFA smoothing and two-grid convergence factors often exactly match the true convergence factor of multigrid applied to a problem with periodic boundary conditions [12, 13, 10]. Recently, the validity of LFA has been further analysed [14], extending this exact prediction to a wider class of problems. However, the LFA smoothing factor is also known to lose its predictivity of the true convergence in some cases [15, 16, 17]. In particular, the smoothing factor of LFA overestimates the two-grid convergence factor for the Taylor-Hood () discretization of the Stokes equations with Vanka relaxation [16]. Even for the scalar Laplace operator, the LFA smoothing factor fails to predict the observed multigrid convergence factor for higher-order finite-element methods [15].
Two main questions interest us here. First, we look to extend the study of [16] to consider LFA of block-structured relaxation schemes for finite-element discretizations of the Stokes equations. Secondly, we consider if the LFA smoothing factor can predict the convergence factors for these relaxation schemes. Recently, LFA for multigrid based on block-structured relaxation schemes applied to the marker-and-cell (MAC) finite-difference discretization of the Stokes equations was shown to give a good prediction of convergence [18], in contrast to the results of [16]. Thus, a natural question to investigate is whether the contrasting results between [18] and [16] is due to the differences in discretization or those in the relaxation schemes considered. Here, we apply the relaxation schemes of [18] to the discretization from [16], as well as an “intermediate” discretization using stabilized approaches.
In recent decades, many block relaxation schemes have been studied and applied to many problems, including Braess-Sarazin-type relaxation schemes [3, 19, 20, 4, 21], Vanka-type relaxation schemes [3, 22, 4, 16, 23, 24, 7], Uzawa-type relaxation schemes [25, 26, 27, 6, 28], distributive relaxation schemes [29, 5, 30, 31, 32] and other types of methods [33, 34]. Even though LFA has been applied to distributive relaxation [35, 11], Vanka relaxation [16, 24, 36, 37], and Uzawa-type schemes [26] for the Stokes equations, most of the existing LFA has been for relaxation schemes using (symmetric) Gauss-Seidel (GS) approaches, and for simple finite-difference and finite-element discretizations. Considering modern multicore and accelerated parallel architectures, we focus on schemes based on weighted Jacobi relaxation with distributive, Braess-Sarazin, and Uzawa relaxation for common finite-element discretizations of the Stokes equations.
Some key conclusions of this analysis are as follows. First, while the LFA smoothing factor gives a good prediction of the true convergence factor for the stabilized discretizations with distributive weighted Jacobi and Braess-Sarazin relaxation, it does not for the Uzawa relaxation (in contrast to what is seen for the MAC discretization [18, 35]). For no cases does the LFA smoothing factor offer a good prediction of the true convergence behaviour for the (stable) discretization, suggesting that the discretization is responsible for the lack of predictivity, consistent with the results in [15, 16]. For both stable and stabilized discretizations, we see that standard distributive weighted Jacobi relaxation loses some of its high efficiency, in contrast to what is seen for the MAC scheme [18, 35] but that robustness can be restored with an additional relaxation sweep. Exact Braess-Sarazin relaxation is also highly effective, with LFA-predicted convergence factors of in the stabilized cases and in the stable case. To realize these rates with inexact cycles, however, requires nested W-cycles to solve the approximate Schur complement equation accurately enough in the stabilized case, although simple weighted Jacobi on the approximate Schur complement is observed to be sufficient in the stable case. For Uzawa-type relaxation, we see a notable gap between predicted convergence with exact inversion of the resulting Schur complement, versus inexact inversion, although some improvement is seen when replacing the approximate Schur complement with a mass matrix approximation, as is commonly used in block-diagonal preconditioners [38, 39, 40]. Overall, however, we see that distributive weighted Jacobi (DWJ) (with 2 sweeps of Jacobi relaxation on the pressure equation) outperforms both Braess-Sarazin relaxation (BSR) and Uzawa relaxation, for the stabilized discretizations, while DWJ and inexact BSR offer comparable performance for the stable discretization.
We organize this paper as follows. In Section 2, we introduce two stabilized and the stable mixed finite-element discretizations of the Stokes equations in two dimensions (2D). In Section 3, we first review the LFA approach, then discuss the Fourier representation for these discretizations. In Section 4, LFA is developed for DWJ, BSR, and Uzawa-type relaxation, and optimal LFA smoothing factors are derived for the two stabilized methods with DWJ and BSR. Multigrid performance is presented to validate the theoretical results. Section 5 exhibits optimized LFA two-grid convergence factors and measured multigrid convergence factors for the discretization. Furthermore, a comparison of the cost and effectiveness of the relaxation schemes is given. Conclusions are presented in Section 6.
2 Discretizations
In this paper, we consider the Stokes equations,
[TABLE]
where is the velocity vector, is the (scalar) pressure of a viscous fluid, and represents a (known) forcing term, together with suitable boundary conditions. Because of the nature of LFA, we validate our predictions against the problem with periodic boundary conditions on both and . Discretizations of (1) typically lead to a linear system of the following form:
[TABLE]
where corresponds to the discretized vector Laplacian, and is the negative of the discrete divergence operator. If the discretization is naturally unstable, then is the stabilization matrix, otherwise . In this paper, we discuss two stabilized and the stable finite-element discretizations.
The natural finite-element approximation of Problem (1) is: Find and such that
[TABLE]
where
[TABLE]
and , are finite-element spaces. Here, satisfies homogeneous Dirichlet boundary conditions in place of any non-homogenous essential boundary conditions on . Problem (3) has a unique solution only when and satisfy an inf-sup condition (see [1, 41, 42, 43]).
2.1 Stabilized discretizations
The standard equal-order approximation of (3) is well-known to be unstable [42, 1]. To circumvent this, a scaled pressure Laplacian term can be added to (3); for a uniform mesh with square elements of size , we subtract
[TABLE]
for . With this, the resulting linear system is given by
[TABLE]
where is the Laplacian operator for the pressure. Denote , and , where . From [1], the red-black unstable mode , can be moved from a zero eigenvalue to a unit eigenvalue ( giving stability without loss of accuracy) by choosing so that
[TABLE]
where is the mass matrix. Substituting the bilinear stiffness and mass matrices into (4), we find . We refer to this method as the Poisson-stabilized discretization (PoSD).
An projection to stabilize the discretization, proposed in [43], stabilizes with
[TABLE]
where is the projection from into the space of piecewise constant functions on the mesh. We refer to this method as the projection stabilized discretization (PrSD). The element matrix of (5) is given by
[TABLE]
where is the element mass matrix for the bilinear discretization and . In the projection stabilized method, we can write , where is given by the 9-point stencil
[TABLE]
Applying (4) to , we find that is the optimal choice.
2.2 Stable discretizations
In order to guarantee the well-posedness of the discrete system (2) with , the discretization of the velocity and pressure unknowns should satisfy an inf-sup condition,
[TABLE]
where is a constant. Taylor-Hood () elements are well known to be stable [41, 1], where the basis functions associated with these elements are biquadratic for each component of the velocity field and bilinear for the pressure.
3 LFA preliminaries
3.1 Definitions and notations
In many cases, the LFA smoothing factor offers a good prediction of multigrid performance. Thus, we will explore the LFA smoothing factor and true (measured) multigrid convergence for the three types of relaxations considered here. We first introduce some terminology of LFA, following [10, 11]. We consider the following two-dimensional infinite uniform grids,
[TABLE]
with
[TABLE]
The coarse grids, , are defined similarly.
Let be a scalar Toeplitz operator defined by its stencil acting on as follows:
[TABLE]
with constant coefficients , where is a function in . Here, is a finite index set. Because is formally diagonalized by the Fourier modes , where and , we use as a Fourier basis with \boldsymbol{\theta}\in\big{[}-\frac{\pi}{2},\frac{3\pi}{2}\big{)}^{2}. High and low frequencies for standard coarsening (as considered here) are given by
[TABLE]
Definition 3.1**.**
If, for all functions ,
[TABLE]
we call the symbol of .
In what follows, we consider linear systems of operators, which read
[TABLE]
where depends on which discretization we use.
For the stabilized approximations, the degrees of freedom for both velocity and pressure are only located on as pictured at left of Figure 1. In this setting, the in (7) are scalar Toeplitz operators. Denote as the symbol of . Each entry in is computed as the (scalar) symbol of the corresponding block of , following Definition 3.1. Thus, is a matrix. All blocks in are diagonalized by the same transformation on a collocated mesh.
However, for the discretization, the degrees of freedom for velocity are located on , containing four types of meshpoints as shown at right of Figure 1. The Laplace operator in (7) is defined by extending (6), with taken to be a finite index set of values, with , V_{X}\subset\big{\{}(z_{x}+\frac{1}{2},z_{y})|(z_{x},z_{y})\in\mathbb{Z}^{2}\big{\}}, V_{Y}\subset\big{\{}(z_{x},z_{y}+\frac{1}{2})|(z_{x},z_{y})\in\mathbb{Z}^{2}\big{\}}, and V_{C}\subset\big{\{}(z_{x}+\frac{1}{2},z_{y}+\frac{1}{2})|(z_{x},z_{y})\in\mathbb{Z}^{2}\bigg{\}}. With this, the (scalar) Laplace operator is naturally treated as a block operator, and the Fourier representation of each block can be calculated based on Definition 3.1, with the Fourier bases adapted to account for the staggering of the mesh points. Thus, the symbols of and are matrices. For more details of LFA for the Laplace operator using higher-order finite-element methods, refer to [15]. Similarly to the Laplace operator, both terms in the gradient, and , can be treated as ()-block operators. Then, the symbols of and are matrices, calculated based on Definition 3.1 adapted for the mesh staggering. The symbols of and are the conjugate transposes of those of and , respectively. Finally, . Accordingly, is a matrix for the discretization.
Definition 3.2**.**
The error-propagation symbol, , for a block smoother on the infinite grid satisfies
[TABLE]
for all , and the corresponding smoothing factor for is given by
[TABLE]
where is an eigenvalue of .
In Definition 3.2, for the stabilized case (and is a matrix) and for the stable case (where is a matrix).
The error-propagation symbol for a relaxation scheme, represented by matrix , applied to either the stabilized or stable scheme is written as
[TABLE]
where represents parameters within , the block approximation to , is an overall weighting factor, and and are the symbols for and , respectively. Note that is a function of some parameters in Definition 3.2. In this paper, we focus on minimizing with respect to these parameters, to obtain the optimal LFA smoothing factor.
Definition 3.3**.**
Let be the set of allowable parameters and define the optimal smoothing factor over as
[TABLE]
If the standard LFA assumption of an “ideal” CGC holds, then the two-grid convergence factor can be estimated by the smoothing factor, which is easy to compute. However, as expected, we will see that this idealized CGC does not lead to a good prediction for some cases we consider below. When the LFA smoothing factor fails to predict the true two-grid convergence factor, the LFA two-grid convergence factor can still be used. Thus, we give a brief introduction to the LFA two-grid convergence factor in the following.
Let
[TABLE]
We use the ordering of for the four harmonics. To apply LFA to the two-grid operator,
[TABLE]
we require the representation of the CGC operator,
[TABLE]
where is the multigrid interpolation operator and is the restriction operator. The coarse-grid operator, , can be either the Galerkin or rediscretization operator.
Inserting the representations of into (8), we obtain the Fourier representation of two-grid error-propagation operator as
[TABLE]
where
[TABLE]
in which stands for the block diagonal matrix with diagonal blocks, , and .
Here, we use the standard finite-element interpolation operators and their transposes for restriction. For , the symbol is well-known [10] while, for the nodal basis for , the symbol is given in [15].
Definition 3.4**.**
The asymptotic two-grid convergence factor, , is defined as
[TABLE]
In what follows, we consider a discrete form of , denoted by , resulting from sampling over only a finite set of frequencies. For simplicity, we drop the subscript throughout the rest of this paper, unless necessary for clarity.
3.2 Fourier representation of discretization operators
3.2.1 Fourier representation of the stabilized discretization
By standard calculation, the symbols of the stiffness and mass stencils are
[TABLE]
respectively. The stencils of the partial derivative operators and are
[TABLE]
respectively, and the corresponding symbols are
[TABLE]
where denotes the conjugate transpose. Thus, the symbols of the stabilized finite-element discretizations of the Stokes equations are given by
[TABLE]
For the Poisson-stabilized discretization, the symbol of is . For the projection stabilized method, following (5), the symbol of is
[TABLE]
For convenience, we write for the last block of Equation (2), and its symbol as in the rest of this paper.
3.2.2 Fourier representation of stable discretizations
The symbols of the stiffness and mass stencils for the discretization using nodal basis functions in 1D are
[TABLE]
respectively [15]. Here, we note that the entries correspond to the symbols associated with basis functions at the nodes of the mesh, while the entries correspond to the symbols associated with cell-centre (bubble) basis functions. The off-diagonal entries express the interaction between the two types of basis functions. Then, the Fourier representation of in 2D can be written as a tensor product,
[TABLE]
The tensor product preserves block structuring; that is, is a matrix, ordered as mesh nodes, -edge midpoints, -edge midpoints, and cell centres. Each row of reflects the connections between one of the four types of degrees of freedom with each of these four types. Similarly, there are four types of stencils for and .
The stencils and the symbols of for the nodal, -edge, -edge, and cell-centre degrees of freedom are
[TABLE]
respectively. Denote .
Similarly to , the symbol of the stencil of can be written as
[TABLE]
Thus, the Fourier representation of the finite-element discretization of the Stokes equations can be written as
[TABLE]
Note that the Fourier symbol for the discretization is a matrix, and that the LFA smoothing factor for the approximation generally fails to predict the true two-grid convergence factor [15, 16]. The same behavior is seen for the relaxation schemes considered here. Therefore, we do not present smoothing factor analysis for this case and only optimize two-grid LFA predictions numerically.
4 Relaxation for discretizations
4.1 DWJ relaxation
Distributive GS (DGS) relaxation [5, 32] is well known to be highly efficient for the MAC finite-difference discretization [10], and other discretizations [33, 44]. Its sequential nature is often seen as a significant drawback. However, Distributive weighted Jacobi (DWJ) relaxation was recently shown to achieve good performance for the MAC discretization [18]. Thus, we consider DWJ relaxation for the finite-element discretizations considered here. The discretized distribution operator can be represented by the preconditioner
[TABLE]
Then, we apply blockwise weighted-Jacobi relaxation to the distributed operator
[TABLE]
where we note that the operators and are formed by taking products of the discrete derivative operators and, thus, do not satisfy the identity .
The discrete matrix form of is
[TABLE]
where is the Laplacian operator discretized at the pressure points. For standard distributive weighted-Jacobi relaxation (with weights ), we need to solve a system of the form
[TABLE]
then distribute the updates as . We use in the block of (12), because the diagonal entries of the block will be of the form of a constant times (up to boundary conditions), for both stabilization terms. The error propagation operator for the scheme is, then, .
The symbol of the blockwise weighted-Jacobi operator, , is
[TABLE]
By standard calculation, the eigenvalues of the error-propagation symbol, , are
[TABLE]
where and
Noting that is very simple, we first consider a lower bound on the optimal LFA smoothing factor corresponding to .
Lemma 4.1**.**
[TABLE]
and this value is achieved if and only if .
Proof.
It is easy to check that for . The minimum of is with or and the maximum is with or . Thus, under the condition . ∎
Remark 4.1**.**
The optimal smoothing factor for damped Jacobi relaxation for the finite-element discretization of the Laplacian is with . Thus, this offers an intuitive lower bound on the possible performance of block relaxation schemes that include this as a piece of the overall relaxation.
From (13), we see that the only difference between the eigenvalues of DWJ relaxation for the Poisson-stabilized and projection stabilized methods is in the third eigenvalue, which depends on and, consequently, on the stabilization term.
4.1.1 Poisson-stabilized discretization with DWJ relaxation
For the Poisson-stabilized case, with and . By standard calculation, , with \big{(}\cos\theta_{1},\cos\theta_{2}\big{)}=(-1,-1), and with \big{(}\cos\theta_{1},\cos\theta_{2}\big{)}=(\frac{8}{17},0) or .
Theorem 4.1**.**
The optimal smoothing factor for the Poisson-stabilized discretization with DWJ relaxation is , that is,
[TABLE]
and is achieved if and only if
[TABLE]
Proof.
\displaystyle\min_{(\alpha_{2},\omega)}\max_{\boldsymbol{\theta}\in T^{{\rm high}}}\bigg{\{}\big{|}1-\frac{\omega}{\alpha_{2}}y_{2}\big{|}\bigg{\}}=\frac{y_{2,\rm{max}}-y_{2,\rm{min}}}{y_{2,\rm{max}}+y_{2,\rm{min}}}=\frac{55}{89} with the condition that . Because , we need to require for all to achieve this factor. It follows that . ∎
4.1.2 Projection stabilized discretization with DWJ relaxation
For the projection stabilized discretization, depends on given in (9), and standard calculation gives with \big{(}\cos\theta_{1},\cos\theta_{2}\big{)}=(-1,-1) and with or .
Theorem 4.2**.**
The optimal smoothing factor for the projection stabilized discretization with DWJ relaxation is , that is,
[TABLE]
and is achieved if and only if
[TABLE]
Proof.
\displaystyle\min_{(\alpha_{2},\omega)}\max_{\boldsymbol{\theta}\in T^{{\rm high}}}\bigg{\{}\big{|}1-\frac{\omega}{\alpha_{2}}y_{2}\big{|}\bigg{\}}=\frac{y_{2,\rm{max}}-y_{2,\rm{min}}}{y_{2,\rm{max}}+y_{2,\rm{min}}}=\frac{65}{97} with the condition that . Since , we need to require for all to achieve this factor, which leads to . ∎
Comparing the Poisson-stabilized and projection stabilized discretizations using DWJ, we see that the optimal LFA smoothing factor for the Poisson-stabilized discretization slightly outperforms that of the projection stabilized discretization. In both cases, a stronger relaxation on the block of (11) would be needed in order to improve performance to match the lower bound on the convergence factor of . A natural approach is to using more iterations to solve the pressure equation in DWJ. We explore the LFA predictions for this case in the following.
4.1.3 Stabilized discretization with 2 sweeps of Jacobi for DWJ relaxation
Denote the block of (11) as . We consider applying two sweeps of weighted-Jacobi relaxation with equal weights, , on the pressure equation. As before, we note that has a constant diagonal entry proportional to , so we write weighted Jacobi relaxation on as for . Thus, we can represent this relaxation scheme as solving
[TABLE]
where \hat{G}=\Big{(}2G_{J}^{-1}-G_{J}^{-1}GG_{J}^{-1}\Big{)}^{-1}. The symbol of , is
[TABLE]
By standard calculation, the eigenvalues of the error-propagation symbol, , are
[TABLE]
where , where the symbol of is , with defined as in (13). Note that has the same eigenvalue, as that of . A natural question is whether \displaystyle\min_{(\alpha_{1},\omega_{J},\omega)}\max_{\boldsymbol{\theta}\in T^{{\rm high}}}\bigg{\{}\big{|}1-\omega y_{3}\big{|}\bigg{\}}=\frac{1}{3}, which is shown in the following theorems.
Theorem 4.3**.**
The optimal smoothing factor for the Poisson-stabilized discretization with 2 sweeps of Jacobi for DWJ relaxation is , that is,
[TABLE]
and is achieved if and only if and either
[TABLE]
or
[TABLE]
Proof.
Recall that , where . Let
[TABLE]
We first show that under some conditions on the parameters, and . Let and be the maximum and minimum of , respectively. If , then it must be that
[TABLE]
Next, we need to find what and are. As discussed earlier, . Thus, , where . Note that is a quadratic function with the axis of symmetric, . Thus, the extreme values of are achieved at the points , or 1. Based on and , we consider two cases.
If , we have
[TABLE]
Note that (19) indicates that . Combining with (20) leads to . However, . Thus, there is no such that in this case. 2. 2.
To guarantee that , we require that . Assume that . It follows that . Recall that .
- (a)
If , we have
[TABLE]
Then, the extreme values of are
[TABLE]
Substituting (22) in to (19), we have
[TABLE]
To guarantee (23) makes sense, in combination with (21) gives
[TABLE]
Recall that there is another eigenvalue, , of . In order to obtain , we thus require
[TABLE]
- (b)
A similar argument holds if , leading to the second set of conditions.
∎
Note that the set of parameters values defined in Theorem 4.3 is not empty, with parameters and in the set.
Theorem 4.4**.**
The optimal smoothing factor for the projection stabilized discretization with two sweeps of Jacobi for DWJ relaxation is , that is,
[TABLE]
and is achieved if and only if and either
[TABLE]
or
[TABLE]
Proof.
The proof is similar to that of Theorem 4.3. ∎
Remark 4.2**.**
Theorems 4.3 and 4.4 tell us that two sweeps of weighted-Jacobi relaxation on the pressure equation in DWJ are required to achieve optimal performance. This is different than the case of DWJ for the MAC discretization [18], where the optimal convergence factor of is attained with one sweep of relaxation on the pressure equation.
Remark 4.3**.**
Red-black Gauss-Seidel relaxation [10] is an attractive tool for parallel computation as it typically offers better relaxation properties while retaining parallelism. However, due to the added coupling of the finite-element operators considered here, four-colour or nine-colour relaxation would be needed to decouple the updates. Thus, we restrict ourselves to weighted Jacobi relaxation.
4.2 Braess-Sarazin relaxation
Although DWJ relaxation is efficient, we see clearly in the above that it “underperforms” in relation to weighted Jacobi relaxation for the scalar Poisson problem unless additional work is done on the pressure equation. Furthermore, proper construction of the preconditioner, , is not always possible or straightforward, especially for other types of saddle-point problems. Considering these obstacles, we also analyse other block-structured relaxation schemes. Braess-Sarazin-type algorithms were originally developed as a relaxation scheme for the Stokes equations [19], requiring the solution of a greatly simplified but global saddle-point system. The (exact) BSR approach was first introduced in [19], where it was shown that a multigrid convergence rate of can be achieved, where denotes the number of smoothing steps on each level. As a relaxation scheme for the system in (2), one solves a system of the form
[TABLE]
where is an approximation to , the inverse of which is easy to apply, for example . Solutions of (25) are computed in two stages as
[TABLE]
where , and is a chosen weight for to obtain a better approximation to . We consider an additional weight, , for the global update, , to improve the effectiveness of the correction to both the velocity and pressure unknowns.
There is a significant difficulty in practical use of exact BSR because it requires an exact inversion of the approximate Schur complement, , which is typically very expensive. A broader class of iterative methods for the Stokes problem is discussed in [21], which demonstrated that the same performance can be achieved as with exact BSR when the pressure correction equation is not solved exactly. In practice, an approximate solve is sufficient for the Schur complement system, such as with a few sweeps of weighted Jacobi relaxation or a few multigrid cycles. In what follows, we take and analyze exact BSR; to see what convergence factor can be achieved. In numerical experiments, we then consider whether it is possible to achieve the same convergence factor using an inexact solver. Note that some studies [3, 8, 45] have shown the efficiency of inexact Braess-Sarazin relaxation. The symbol of is given by
[TABLE]
The symbol of the error-propagation matrix for weighted exact BSR is . A standard calculation shows that the determinant of is
[TABLE]
We first establish a lower bound on the LFA smoothing factor for the stabilized method with BSR.
Theorem 4.5**.**
The optimal LFA smoothing factor for the Poisson-stabilized and projection stabilized discretizations with exact BSR is not less than .
Proof.
From (27), two eigenvalues of are given by
[TABLE]
which are independent of the stabilization term, . From Lemma 4.1, we know that for , the optimal smoothing factor is , under the condition that . Note that if , then . Because there is another eigenvalue, , the optimal LFA smoothing factor is not less than . ∎
Similarly to DWJ, we see that the Jacobi relaxation for the Laplacian discretization places a limit on the overall performance of BSR. From (27), the third eigenvalue of is , where (because both and are imaginary). Thus, we only need to check whether we can choose and so that over all high frequencies, while also ensuring and .
Theorem 4.6**.**
The optimal smoothing factor for both the Poisson-stabilized and projection stabilized discretizations with exact BSR is
[TABLE]
if and only if
[TABLE]
Proof.
Note that , and choose such that . If is positive, the following always holds
[TABLE]
Furthermore, if , we have
[TABLE]
For both discretizations, we can check that over the high frequencies. From (28), it is easy to see that , with . ∎
4.3 Inexact Braess-Sarazin relaxation
Here, we also consider solving the Schur complement equation, (26), by weighted Jacobi relaxation with weight, . Following [21], we refer to this as inexact Braess-Sarazin relaxation (IBSR). Let the corresponding block preconditioner be , given by
[TABLE]
where is the approximation of used in (26). For one sweep of weighted Jacobi relaxation, is given by
[TABLE]
and for 2 sweeps of weighted Jacobi relaxation with equal weights, is given by
[TABLE]
By direct computation, can be written in terms of a stencil:
[TABLE]
The symbol of is for . In fact, . Let be the symbol of ,
[TABLE]
Similarly, let be the symbol of ,
[TABLE]
where
[TABLE]
Finally, the symbol of is given by
[TABLE]
The symbol of the error-propagation matrix for IBSR is . A standard calculation shows that the determinant of is
[TABLE]
From (31), we see there is an eigenvalue , which is the same as that of exact BSR. As before, the question now becomes whether there is a choice of and such that convergence equal to that of exact BSR can be achieved. We leave this as an open question for future work and, instead, numerically optimize the two-grid convergence factor over these parameters.
Remark 4.4**.**
A similar form to (30) occurs for inexact BSR applied to the stable approximation, modifying the stencil of to be zero, and accounting for the block structure shown in (10).
4.4 Numerical experiments for stabilized discretizations
We now present LFA predictions, validating DWJ, (I)BSR, and the related Uzawa iteration against measured multigrid performance for these schemes. We consider the homogeneous problem in (1), with periodic boundary conditions, and a random initial guess, .
Convergence is measured using the averaged convergence factor, , with , and . The LFA predictions are made with , for both the smoothing factor, , and two-grid convergence factor, . For testing, we use standard cycles with bilinear interpolation for variables and biquadratic interpolation for variables, and their adjoints for restriction. We consider both rediscretization and Galerkin coarsening, noting that they coincide for all terms except the stabilization terms that include a scaling of . The coarsest grid is a mesh with 4 elements. Where significant differences arise, we also report two-grid convergence rates for cycles.
4.4.1 PoSD with DWJ
From the range of parameters allowed in (14), we select , and (for convenience, satisfying the equality in (14)) to compute the LFA predictions. Figure 2 shows the spectrum of the two-grid error-propagation operators for DWJ relaxation with rediscretization and Galerkin coarsening. Note that the two-grid convergence factor is the same as the optimal smoothing factor for rediscretization, but not for Galerkin coarsening.
In order to see the sensitivity of performance to parameter choice, we consider the two-grid LFA convergence factor with rediscretization coarsening. From (14), we know that there are many optimal parameters. To fix a single parameter for DWJ, we consider the case of and, at the left of Figure 3, we present the LFA-predicted two-grid convergence factors for DWJ with variation in and . Here, we see strong sensitivity to “too small” values of both parameters, for and , including a notable portion of the optimal range of values predicted by the LFA smoothing factor. At the right of Figure 3, we fix and vary and . The two lines are the lower and upper bounds from (14), between which LFA predicts the optimal convergence factor should be achieved. Note that not all of the allowed parameters obtain the optimal convergence factor. Here, we see great sensitivity for large values of , but a large range with generally similar performance as in the optimal parameter case.
In Table 1, we present the multigrid performance of DWJ with -cycles for rediscretization coarsening. These results show measured multigrid convergence factors that coincide with the LFA-predicted two-grid convergence factors. Similar results are seen for -cycles with rediscretization. For Galerkin coarsening, nearly identical -cycle results are seen when , but divergence is seen for -cycles with or 2, and for all -cycles tested. In Table 2, we report the multigrid performance of DWJ using 2 sweeps of Jacobi relaxation on the pressure equation with rediscretization for PoSD. Here, we take as in Theorem 4.3. We see that the LFA convergence factor accurately predicts the measured performance.
4.4.2 PrSD with DWJ
From the range of parameters allowed in (15), we choose . Figure 4 shows that the smoothing factor provides a good prediction for the two-grid convergence factor with rediscretization, but not with Galerkin coarsening.
Similarly to the discussion above, we consider the sensitivity to parameter choice for DWJ applied to PrSD. To fix a single parameter for DWJ, we consider the case of . At the left of Figure 5, we present the LFA-predicted convergence factors for DWJ with variation in and , again seeing a strong sensitivity to “too small” values of the parameters. At the right of Figure 5, we fix . The two lines are the lower and upper bounds from (15), between which LFA predicts the optimal convergence factor should be achieved. Note that not all of the parameters in this range obtain the optimal convergence factor. We see that, for small , the convergence factor is very sensitive to large values of .
In Table 3, we present the multigrid performance of DWJ relaxation with -cycles for rediscretization coarsening. We see that the measured multigrid convergence factors match well with the LFA-predicted two-grid convergence factors. For Galerkin coarsening, as in the case of PoSD, we see divergence when , but performance matching that of rediscretization for . Here, -cycle results are similar to the -cycle results for both rediscretization and Galerkin coarsening approaches. In Table 4, we compare the LFA predictions with multigrid performance for DWJ using 2 sweeps of Jacobi relaxation on the pressure equation. Here, we take as in Theorem 4.4, and observe a good match between the LFA predictions and measured performance.
4.4.3 PoSD with BSR
Next, we consider BSR for PoSD, first displaying the two-grid LFA convergence factor as a function of for rediscretization coarsening with in Figure 6. Comparing the convergence factor with , for , we see a good match over the interior of the interval predicted by Theorem 4.6. For larger values of , this agreement deteriorates as is typical when the behavior of coarse-grid correction becomes dominant. At the right of Figure 6, we see good agreement between and when with fixed . In both cases, similar behaviour is seen with Galerkin coarsening.
Motivated by the above, we use and for multigrid experiments with rediscretization, solving the Schur complement equation exactly. Table 5 shows that the measured multigrid convergence factors match well with the LFA-predicted two-grid convergence factors for -cycles with rediscretization coarsening, and similar results are seen for Galerkin coarsening.
In Table 6, we report the LFA prediction for IBSR with different parameters and one or two sweeps of Jacobi relaxation on the approximate Schur complement for PoSD. Here, we clearly see the benefit of two sweeps of relaxation on the approximate Schur complement over a single sweep, as well as that better performance is possible when (numerically) optimizing all of the parameters for IBSR independent of the optimization for exact BSR.
Table 7 shows that the LFA-predicted 2-grid convergence factors closely match those seen in practice. However, as shown in Table 8, significant degradation is seen when considering W-cycles, particularly as increases.
The gap between the results seen for exact BSR in Table 5 and those for IBSR in Table 8 is quite significant. To maintain the performance observed for exact BSR, we could simply use more Jacobi iterations on the Schur complement system in IBSR; however, experiments showed that this did not lead to a scalable algorithm. Instead, we consider solving the Schur complement system by applying a multigrid -cycle using weighted Jacobi relaxation with weight , shown in Table 9. From Table 9, we observe that using only 1 or 2 -cycles on the approximate Schur complement achieves convergence factors essentially matching those in Table 5, showing that the cycle is the most cost effective.
4.4.4 PrSD with BSR
We now consider BSR for PrSD. At the left of Figure 7, we see a good agreement between the two-grid convergence factor and for for some parameters in the range defined in Theorem 4.6 when using rediscretization. A larger interval of agreement is seen for the corresponding results for Galerkin coarsening. In both cases, agreement between the two-grid convergence factor and degrades as increases, as expected.
Note that Theorem 4.6 demonstrates that the smoothing factor for BSR is a function of (but the same is not necessarily true for the convergence factor). In Figure 7, we plot the LFA smoothing and convergence factors for BSR with rediscretization as a function of , with and see that these factors generally agree, although the smoothing factor slightly underestimates the convergence factor. As two-grid convergence is, however, sensitive to the choice of , the smoothing factor generally underestimates the convergence factor for other values of .
Fixing with (as suggested by Figure 7 for ), Table 10 shows that the measured multigrid convergence factors again match well with the LFA-predicted two-grid convergence factors for -cycles with rediscretization coarsening. Note, however, the degradation for , where the smoothing factor analysis predicts a convergence factor of that is not realized. The convergence factor of can be achieved by choosing and in the BSR scheme with either or cycles, but these choices lead to a slight degradation with . Similar results are seen for Galerkin coarsening with and with the notable exception that the smoothing factor prediction was matched by both the two-grid LFA convergence factor and true -cycle convergence in this case for all experiments.
In Table 11, we report the LFA prediction for IBSR with one or two sweeps of Jacobi relaxation on the approximate Schur complement with different parameters for PrSD. As in the case of PoSD, one sweep is not enough to obtain performance comparable to exact BSR, and there is a significant advantage to independently optimizing the parameters for IBSR.
Considering, then, the two-grid method with these optimized parameters and two relaxation sweeps on the approximate Schur complement, Table 12 shows that two-grid LFA offers a good prediction of performance. In Table 13, however, we see degraded performance when using -cycles.
Thus, we again consider solving the Schur complement system by applying a multigrid -cycle. Table 14 shows that this IBSR is seen to be effective, requiring 1 to 4 cycles on the Schur complement system to match the convergence seen in Table 10. Again, cycles seem to be the most cost effective option for the approximate Schur complement.
4.5 Stabilized discretizations with Uzawa relaxation
Multigrid methods using Uzawa relaxation schemes [6, 26, 27] are popular approaches due to their low cost per iteration. We consider Uzawa relaxation as a simplification of BSR, determining the update as the (weighted) solution of
[TABLE]
where is an approximation to and is an approximation of the Schur complement, .
Here, we consider an analogue to exact BSR with . The choice of is discussed later. In this setting, we observe that minimizing the LFA smoothing factor does not minimize the LFA convergence factor. Thus, we consider minimizing the two-grid convergence factor numerically for and with rediscretization coarsening, and compare with measured multigrid performance.
We consider three approximations to the Schur complement, starting from the true approximate Schur complement, . Motivated by the stable finite-element case, we also consider replacing in this matrix by a weighted mass matrix, yielding . Finally, motivated by the finite-difference case and efficiency of implementation, we consider taking , for a scalar weight, , to be optimized by the LFA. Note that, due to the constant-coefficient stencils assumed by LFA, this corresponds to using a single sweep of Jacobi to approximate solution of either of the two above approximations.
For the case of , the optimized LFA two-grid convergence factors for with rediscretization coarsening are 0.428 for PoSD and 0.436 for PrSD. These are notably worse than the BSR smoothing factor of , which is achieved for or cycles. Here, cycles reflect this convergence, achieving measured convergence factors of 0.417 for PoSD and 0.526 for PrSD. Increasing the number of relaxation sweeps per iteration yields some improvement in the predicted LFA convergence factors when optimizing parameters again, but not enough to outperform repeated cycles.
For the mass-matrix-based approximation, , the optimized two-grid convergence factors for with rediscretization coarsening are 0.5 for PoSD and 0.417 for PrSD. While poorer convergence might be expected in both cases, the addition of an extra parameter, , allows a (slight) improvement for PrSD. In both cases, we observe consistent performance with numerical experiments, achieving convergence factors of 0.493 for PoSD and 0.392 for PrSD using or cycles.
Finally, for the diagonal approximation , we achieve notably better performance optimizing with than for . For PoSD, the optimized two-grid LFA convergence factor is 0.382, while it is 0.497 for PrSD. In practice, we achieve slightly worse convergence factors using cycles with rediscretization coarsening, of 0.531 for PoSD and 0.543 for PrSD. These are both significantly worse than the convergence factors of observed using inexact BSR; however, it must be noted that -cycles on the Schur complement system were needed in that case. A better approximation to inverting the true approximate Schur complement would be to apply multigrid to it, just as was done for IBSR above. Here, we observe that significant work may be needed to achieve convergence similar to that of Uzawa where the Schur complement is exactly inverted, requiring 10 -cycles on the approximate Schur complement to achieve a convergence factor of 0.416 for PoSD and 0.522 for PrSD, suggesting that the Jacobi version of Uzawa is ultimately more efficient.
4.6 Comparing cost and performance
For convenience, we denote standard DWJ as DWJ(1) and DWJ with 2 sweeps of Jacobi relaxation on the pressure equation as DWJ(2) in the following.
The above results give a clear comparison of the effectiveness of the multigrid cycles with the considered relaxation schemes, but not of their relative efficiencies. To translate from effectiveness to efficiency, we must properly account for the cost per iteration of each relaxation scheme. All schemes assume the residual is already calculated; for the 9-point stencils in , , and stabilization terms, , the cost of a single residual evaluation on a mesh with points is (roughly) that of multiply-add operations, coming from the 7 nonzero blocks in the matrix. For DWJ(1), the rest of the cost of relaxation is fairly easy to account, requiring one diagonal scaling operation on each of the three components of the solution vector, plus matrix-vector products with the pressure Laplacian, , and with both and . Counting multiply-add operations for these on a grid with points, we have for the diagonal scalings, and each for the multiplication with and with and and their transposes, totalling multiply-add operations. For DWJ(2), we need 48n multiply-add operators plus the cost of the second sweep. For the second sweep, we need to compute a residual related to , the block of (11), and a diagonal scaling. Note that the cost of the residual is ( each for the multiplication with , , and with and and their transposes). In total, the cost of DWJ(2) is multiply-add operators. For IBSR, following (26), we require two diagonal scaling operations on each of the velocity components, one matrix-vector product with each of and , and 2 or 3 W(1,1) cycles on the pressure variable. To account for the costs of the W(1,1) cycles, we use the standard cost estimate for W-cycles, as requiring 4 “Work Units” per iteration, where a Work Unit is the cost of forming a residual for the pressure equation. Here, given the 25-point stencil structure seen in Equation (29), each Work Unit requires multiply-add operations, so the total cost of IBSR with 2 W(1,1) cycles on the Schur complement is multiply-add operations (and multiply-add operations if 3 W(1,1) cycles are needed). Finally, Uzawa relaxation with diagonal scaling on the pressure has a cost less than that of DWJ(1), as it requires diagonal scaling again for all three components of the solution, but only one matrix-vector multiplication, with . These total multiply-add operations.
Accumulating the costs of a residual evaluation with these, we have total costs of multiply-add operations per sweep of DWJ(1), multiply-add operations per sweep of DWJ(2), multiply-add operations per sweep of IBSR with 2 W(1,1) cycles per Schur-complement solve, and multiply-add operations per sweep of Uzawa with diagonal scaling. Considering these relative to one-another, we see that DWJ(1) has a cost of about 4/3 per cycle as Uzawa, that DWJ(2) has a per-cycle cost of about 2 times that of Uzawa, and 1.5 times that of DWJ(1), that IBSR has a per-cycle cost of about 3.6 times that of Uzawa, 2.7 and 1.8 times that of DWJ(1) and DWJ(2), respectively. The per-cycle convergence factors observed above are 0.35 per cycle for W(1,1) cycles of DWJ(1) for PoSD and 0.44 per cycle for W(1,1) cycles of DWJ(1) for PrSD, 0.11 per cycle for W(1,1) cycles of DWJ(2) and IBSR for both stabilizations, and 0.53 or 0.54 per cycle for W(1,1) cycles with Uzawa. Comparing efficiencies can now be easily done by appropriately weighting these convergence factors relative to their work: if one iteration costs times that of another, and yields a convergence factor of , then we can easily compare directly to the second convergence factor, , to see if the effective error reduction achieved by the first algorithm in an equal amount of work to the second is better or worse than that achieved by the second. Comparing DWJ(1) to Uzawa, then, for PoSD, we compare to and see that DWJ(1) is more efficient. For PrSD, we compare and see that DWJ(1) and Uzawa are similarly efficient for PrSD. Comparing DWJ(2) to Uzawa, we compare to 0.53(0.54), showing that DWJ(2) is much efficient than Uzawa. Comparing DWJ(2) to DWJ(1), we compare to 0.35 (0.45) and see DWJ(2) is more efficient than DWJ(1). Comparing IBSR to Uzawa, we compare , and see that it as also comparable in efficiency to the others for the case of PrSD, but slightly less efficient than DWJ(1) for PoSD. DWJ(2) and IBSR have the same per-cycle convergence factor, but the cost of DWJ(2) () is less than IBSR (). Thus DWJ(2) is more efficient than IBSR. Overall, DWJ(2) outperforms Uzawa, IBSR and DWJ(1). We note that these results are a little different than those seen for the MAC discretization in [18], where IBSR outperforms other schemes. Differences seen in practice (and the influence of factors ignored in the LFA, such as boundary conditions) are important to consider.
An important practical consideration commonly observed in the LFA literature (see, for example, [18, 35]) is the influence of boundary conditions. In numerical experiments not shown here, we often see significant degradation in convergence between the results for periodic boundary conditions and those for Dirichlet boundary conditions, particularly for cases with larger numbers of relaxation sweeps per cycle. For DWJ(2) with , changing from periodic to Dirichlet boundary conditions results in convergence factors increasing from 0.324 reported in Tables 2 and 4 to about 0.46 (PrSD) or 0.56 (PoSD) for two-grid cycles, and to about 0.64 (PrSD) and 0.7 (PoSD) for W-cycles. For IBSR with , however, the degradation is much less, with W-cycle convergence rates of 0.38 for PoSD (still with 2 inner W(1,1)-cycles for the Schur complement system) and 0.35 for PrSD (with , , and 4 W(1,1) cycles with for the Schur complement system). Clearly this difference in performance is enough to change the balance above, with the added cost of IBSR with inner W-cycles paying off over DWJ(2).
5 Relaxation for discretization
As explored in [15], classical LFA smoothing factor analysis is unreliable for discretizations, making it unsuitable for analysis of the standard stable discretization of the Stokes equations. Thus, we consider only numerical (“brute force”) optimization of two-grid LFA convergence factors in this setting.
For DWJ, we find optimal convergence factors of 0.619 for and 0.558 for . While the former is quite comparable to convergence predicted and achieved for both stabilized discretizations with , we see a significant lack of improvement with increased relaxation, in contrast to the equal-order case. The same is observed for multigrid -cycle performance, with convergence measured at 0.620 and convergence measured at 0.510.
For exact BSR, we find optimal convergence factors of 0.551 for and 0.250 for . While these are slightly larger than the comparable factors of and , respectively, for the stabilized discretizations, they still reflect good performance of the underlying method.
At left of Figure 8, we show the spectral radius of the error-propagation symbol for exact BSR as a function of Fourier frequency, , noting that predicted reduction over the high frequencies is not as good as would be needed to equal two-grid convergence in the equal-order case. In order to see how the convergence factor changes with the parameters and , we display the convergence factor as a function of and at the right of Figure 8. The optimal choice, of and , occurs in a narrow band of values, but larger range of values lead to reasonable results.
As always, an inexact solve of the Schur complement system is needed to yield a practical variant of BSR. While 2 sweeps of Jacobi appears sufficient to achieve scalable -cycle convergence when (see Table 15), we find 3 sweeps are needed to achieve convergence factors of 0.240 (see Table 16), in contrast to results in [21] and for the equal-order discretizations considered here, where a much stronger iteration was needed. Similar results were seen for cycles when 3 sweeps of Jacobi were used for the Schur complement system.
Finally, we consider the same three variants of Uzawa relaxation as examined above for the equal-order case. For , the best convergence factor found for was 0.729, while better convergence was predicted for , with factor 0.554. This is to be expected, perhaps, since the mass matrix is well-known to be a better approximation of the true Schur complement than the classical BSR approximate Schur complement. However, approximating either by a single sweep of Jacobi, yielding , gives a convergence factor 0.717. While 2-grid cycles with match the predicted convergence factor, -cycles did not converge for these parameters.
Comparing, then, the efficiency of inexact BSR and DWJ for the discretization, we see that inexact BSR, where cycles achieve a convergence factor of 0.24 provides roughly the same reduction as 3 cycles with 1 DWJ sweep per cycle, where LFA predicts . Noting that inexact BSR is relatively more expensive in this case, with cost dominated by the two diagonal scalings per sweep on the velocity degrees of freedom, we suggest a proper implementation study is needed to determine which, if either, provides best performance in practice.
6 Conclusion
In this paper, LFA is presented for block-structured relaxation schemes for stabilized and stable finite-element discretizations of the Stokes equations. The convergence and smoothing factors exhibited here provide optimized parameters for DWJ with one or two sweeps of Jacobi relaxation on the pressure equation and BSR for the stabilized discretizations. The convergence of (inexact) BSR clearly outperforms multigrid with both standard DWJ and Uzawa relaxation. However, standard DWJ can be improved by additional relaxation on the pressure equation, and the improved version is more efficient than IBSR. While the LFA smoothing factor loses its predictivity of the two-grid convergence factor for the stable discretization and for Uzawa relaxation for both stabilized and stable discretizations, the two-grid LFA convergence factor can still provide useful predictions. We consider as well the inexact case for BSR, with Jacobi iterations or multigrid cycles used to approximate solution of the Schur complement system, as is suitable for use on modern parallel and graphics processing unit (GPU) architectures. From numerical experiments, we see that inexact BSR can be as good as the exact iteration for solving the Stokes equations. The analysis and LFA predictions demonstrated here offer good insight into the use of block-structured relaxation for other types of saddle-point problems, which will be considered in future work.
Acknowledgements
The work of S.M. was partially supported by an NSERC discovery grant.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] H. C. Elman, D. J. Silvester, A. J. Wathen, Finite elements and fast iterative solvers with applications in incompressible fluid dynamics, 2nd Edition, Numerical Mathematics and Scientific Computation, Oxford University Press, Oxford, 2014.
- 2[2] J. W. Ruge, K. Stüben, Algebraic multigrid, Multigrid methods 3 (13) (1987) 73–130.
- 3[3] J. Adler, T. R. Benson, E. Cyr, S. P. Mac Lachlan, R. S. Tuminaro, Monolithic multigrid methods for two-dimensional resistive magnetohydrodynamics, SIAM J. Sci. Comput. 38 (1) (2016) B 1–B 24.
- 4[4] J. H. Adler, D. B. Emerson, S. P. Mac Lachlan, T. A. Manteuffel, Constrained optimization for liquid crystal equilibria, SIAM J. Sci. Comput. 38 (1) (2016) B 50–B 76.
- 5[5] A. Brandt, N. Dinar, Multigrid solutions to elliptic flow problems, in: Numerical methods for partial differential equations, Vol. 42 of Publ. Math. Res. Center Univ. Wisconsin, Academic Press, New York-London, 1979, pp. 53–147.
- 6[6] M. A. Olshanskii, Multigrid analysis for the time dependent Stokes problem, Math. Comp. 81 (277) (2012) 57–79.
- 7[7] J. Schöberl, W. Zulehner, On Schwarz-type smoothers for saddle point problems, Numer. Math. 95 (2) (2003) 377–399.
- 8[8] J. H. Adler, T. R. Benson, S. P. Mac Lachlan, Preconditioning a mass-conserving discontinuous Galerkin discretization of the Stokes equations, Numer. Linear Algebra Appl. 24 (3) (2017) e 2047, 23.
