Spectral analysis and multigrid preconditioners for two-dimensional space-fractional diffusion equations
Hamid Moghaderi, Mehdi Dehghan, Marco Donatelli, Mariarosa Mazza

TL;DR
This paper develops spectral analysis and multigrid preconditioners for 2D space-fractional diffusion equations, enabling efficient iterative solutions by exploiting Toeplitz structure and proving linear convergence.
Contribution
It introduces a spectral analysis framework for 2D space-FDEs and designs multigrid preconditioners that ensure fast, robust iterative solutions with proven convergence rates.
Findings
Spectral analysis of coefficient matrices guides preconditioner design.
Multigrid methods achieve linear convergence rates.
Preconditioned Krylov methods maintain efficiency with new strategies.
Abstract
Fractional diffusion equations (FDEs) are a mathematical tool used for describing some special diffusion phenomena arising in many different applications like porous media and computational finance. In this paper, we focus on a two-dimensional space-FDE problem discretized by means of a second order finite difference scheme obtained as combination of the Crank-Nicolson scheme and the so-called weighted and shifted Gr\"unwald formula. By fully exploiting the Toeplitz-like structure of the resulting linear system, we provide a detailed spectral analysis of the coefficient matrix at each time step, both in the case of constant and variable diffusion coefficients. Such a spectral analysis has a very crucial role, since it can be used for designing fast and robust iterative solvers. In particular, we employ the obtained spectral information to define a Galerkin multigrid method based on…
| GMRES | ||||||
| 9.000 | ||||||
| 9.000 | ||||||
| 9.000 | ||||||
| 10.000 |
| GMRES | |||||
| GMRES | |||||
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
\marginsize
1.5cm1.5cm1cm2.5cm
Spectral analysis and multigrid preconditioners for two-dimensional
space-fractional diffusion equations
Hamid Moghaderi
Department of Applied Mathematics, Faculty of Mathematics and Computer Science, Amirkabir University of Technology, No. 424, Hafez Ave., 15914, Tehran, Iran
Mehdi Dehghan
Department of Applied Mathematics, Faculty of Mathematics and Computer Science, Amirkabir University of Technology, No. 424, Hafez Ave., 15914, Tehran, Iran
Marco Donatelli
Department of Science and High Technology, University of Insubria, Via Valleggio 11, 22100 Como, Italy
Mariarosa Mazza
Division of Numerical Methods for Plasma Physics, Max Planck Institute for Plasma Physics, Boltzmannstrasse 2, 85748 Garching bei München, Germany
Abstract
Fractional diffusion equations (FDEs) are a mathematical tool used for describing some special diffusion phenomena arising in many different applications like porous media and computational finance. In this paper, we focus on a two-dimensional space-FDE problem discretized by means of a second order finite difference scheme obtained as combination of the Crank-Nicolson scheme and the so-called weighted and shifted Grünwald formula.
By fully exploiting the Toeplitz-like structure of the resulting linear system, we provide a detailed spectral analysis of the coefficient matrix at each time step, both in the case of constant and variable diffusion coefficients. Such a spectral analysis has a very crucial role, since it can be used for designing fast and robust iterative solvers. In particular, we employ the obtained spectral information to define a Galerkin multigrid method based on the classical linear interpolation as grid transfer operator and damped-Jacobi as smoother, and to prove the linear convergence rate of the corresponding two-grid method. The theoretical analysis suggests that the proposed grid transfer operator is strong enough for working also with the V-cycle method and the geometric multigrid. On this basis, we introduce two computationally favourable variants of the proposed multigrid method and we use them as preconditioners for Krylov methods. Several numerical results confirm that the resulting preconditioning strategies still keep a linear convergence rate.
**Keywords: **fractional diffusion equations; CN-WSGD scheme; spectral analysis; GLT theory; multigrid methods
††Email addresses: [email protected]; [email protected] (Hamid Moghaderi), [email protected]; [email protected] (Mehdi Dehghan), [email protected] (Marco Donatelli), [email protected] (Mariarosa Mazza)
1 Introduction
Fractional diffusion equations (FDEs) are a special class of partial differential equations (PDEs) used for describing subdiffusion and superdiffusion phenomena. Among their applications: finance, biology, turbulent flow, and image processing [24, 15, 5, 4]. In more detail, a standard diffusion equation becomes a time-FDE when the first order derivative in time is replaced by a fractional derivative in the Caputo form (see e.g., [23] for a precise definition) and/or a space-FDE when the second order derivatives in space are replaced by the fractional Riemann-Liouville derivatives (to be defined below). In this paper, we deal with space-FDEs and we refer to them simply as FDEs.
Closed-form solutions for FDEs are rarely available, then various numerical discretization methods have been proposed in the last decades. Among them, finite difference schemes have been widely studied. For instance, in 2004 and 2006, Meerschaert and Tadjeran proposed a first order accurate finite difference scheme obtained combining an implicit Euler method in time with a first order approximation of the space derivatives called shifted Grünwald formula [18, 19]. In 2006, together with Scheffler the same authors proposed a finite difference discretization method based on the classical Crank-Nicolson (CN) and the Richardson extrapolation, which guarantee the second order accuracy in time and space, respectively [17]. Both methods were proved to be consistent and unconditionally stable. More recently, in [31] the authors, inspired by the shifted Grünwald difference operator and the CN technique, defined a more general and flexible class of second order accurate methods combining the CN discretization in time with a second order approximation of the Riemann-Liouville fractional derivatives called weighted and shifted Grünwald difference (WSGD). Such methods are briefly referred to as CN-WSGD.
From a numerical linear algebra viewpoint, it is worth noticing that, if one of the previously described finite difference schemes is chosen, then the resulting linear system shows a strong structure and indeed the related coefficient matrices can be seen as a sum of diagonal times Toeplitz matrices (see e.g., [36]). As a consequence, the storage requirement is reduced from to and the complexity of the matrix-vector product from to , where is the mesh-size at each time step.
Recently, several fast iterative solvers which exploit the aforementioned Toeplitz-like structure have been proposed for solving linear systems coming from a finite difference discretization of one-dimensional FDEs. Among them, a circulant preconditioning strategy for Conjugate Gradient for Normal Residual method (CGNR) [16], and an efficient multigrid method [21]. The former has been proven to be superlinearly convergent when the diffusion coefficients are constant, the latter has been shown to be optimal when they are also equal. In paper [8], the authors recognize in the Toeplitz-like structure of the FDEs linear systems a Generalized Locally Toeplitz (GLT) sequence and then use the GLT machinery to study its singular value/eigenvalue distribution as the matrix size diverges. The obtained spectral information is employed to propose two competitive tridiagonal structure preserving preconditioners for Krylov methods and to show that the superlinear convergence of the CGNR method with the circulant preconditioner in [16] cannot be replicated by any Krylov method in the nonconstant coefficient case, while the multigrid in [21] is optimal also for nonconstant coefficients under the only hypothesis that they are uniformly bounded and positive.
Of course, fast solvers for multidimensional FDE problems are of crucial interest. In the two-dimensional setting, we mention the ones proposed in [35, 14, 7]. The first reference deals with an alternating-direction finite difference method for two-dimensional FDEs by fully decoupling the linear systems in -direction and -direction. In [14] a preconditioner defined by the incomplete LU factorization of a block-banded-banded-block approximation of the coefficient matrix is introduced, and the linear systems are solved by a preconditioned Generalized Minimal Residual (GMRES) method and a preconditioned CGNR method. In [7] the authors propose a Locally One Dimensional (LOD) finite difference method for two-dimensional (and three-dimensional) Riesz FDE problem and use the LOD-multigrid method to solve the resulting linear systems. As for the one-dimensional case, also in the multidimensional setting the spectral analysis of the coefficient matrix is crucial to fully understand the behaviour of preconditioned Krylov and multigrid methods when used for solving the associated linear system. However, for multidimensional FDEs such a spectral study is missing even in the case of constant diffusion coefficients.
In this paper, we focus on a two-dimensional FDE problem discretized by means of the CN-WSGD scheme, and we provide a detailed spectral analysis of the resulting coefficient matrix at each time step, both in the constant and variable diffusion coefficients case. The obtained spectral information suggests that classical preconditioners based on multilevel circulant matrices are not suited for solving the involved linear systems, while multigrid methods can be effective and robust solvers. On this basis, we propose a Galerkin multigrid method with classical linear interpolation as grid transfer operator and damped-Jacobi as smoother, and we prove that the corresponding two-grid method has a linear convergence rate. More precisely, we compute the spectral symbol of the involved matrix-sequences both in the constant and variable coefficients case and use this knowledge to formally prove the convergence and the optimality of the two-grid method, i.e., we prove that the number of iterations required to reach a prescribed accuracy depends neither on the discretization step nor on the order of the fractional derivatives. The analysis of the symbol is also used for defining the parameter of the damped Jacobi such that it satisfies the smoothing property [25].
Concerning the V-cycle, its linear convergence rate has been proven only for sequences of matrices in some trigonometric algebras [2]. In spite of this, we show that the linear interpolation operator reveals powerful enough to work also under some perturbations and then we introduce two computationally favourable variants of the proposed multigrid, especially suited in the variable coefficients case. Indeed, in such a case, the setup phase of a multigrid based on the Galerkin approach is computationally too expensive because the structure of the coefficient matrix cannot be preserved, and this prevents a fast computation of the matrix-vector product. Therefore, we propose two preconditioning alternatives:
a Galerkin multigrid applied to a band preconditioner built using the Laplacian; 2. 2)
a geometric multigrid applied to the full coefficient matrix.
The first strategy is computationally very attractive because it involves matrices with only five nonzero diagonals and hence has a linear cost, with a small constant, in . However, its effectiveness reduces when both the fractional derivative orders are far from . On the contrary, the second strategy is reliable for every fractional order, does not need the setup phase and is such that each iteration has a computational cost only about 4/3 times higher than the Jacobi smoother (see [33]). Moreover, it is also well suited as a stand alone solver.
The paper is briefly summarized as follows. In Section 2 we introduce the two-dimensional FDEs equations and recall the CN-WSGD scheme. In Section 3 we firstly describe the notion of symbol and of spectral distribution of a matrix-sequence, as well as the idea behind the GLT. Later, we use these tools to retrieve spectral information on the involved matrices. Such spectral information is then used in Section 4 to study the convergence and the optimality of the proposed multigrid. Finally, Section 5 is devoted to numerical examples and Section 6 contains conclusions and open problems.
2 Problem setting: the CN-WSGD scheme for 2D space-FDEs
In this paper, we are interested in the following initial-boundary value problem of two-dimensional space-FDE
[TABLE]
where is the space domain, and are the fractional derivative orders with respect to and , respectively. The nonnegative functions and are the diffusion coefficients and is the forcing term. The left-sided and the right-sided fractional derivatives in (2.1) are defined in Riemann-Liouville form as follows
[TABLE]
For the discretization of the FDE problem (2.1) we apply the second order accurate CN-WSGD scheme (see [31]). In order to introduce such a scheme, let us fix three positive integers and discretize the domain with
[TABLE]
Let us define
[TABLE]
Using the WSGD formula with shifting parameters , we obtain the following expression for the fractional derivatives
[TABLE]
where , because of the Dirichlet boundary conditions. Here the coefficients are given by
[TABLE]
where are the alternating fractional binomial coefficients defined as
[TABLE]
with the formal notation . For , the fractional binomial coefficients have the following properties
[TABLE]
Similarly, the coefficients satisfy few properties, summarized in the following proposition (see [31]).
Proposition 2.1**.**
Let , be defined as in (2.2). Then the coefficients satisfy the following properties
[TABLE]
The CN-WSGD scheme is obtained combining the CN scheme in time with the WSGD formula for the fractional derivatives and can be written as follows
[TABLE]
where , , and
[TABLE]
with
[TABLE]
Now, we can write the matrix form of the above discretization. First, we need to introduce the following notations. Let and define the following objects:
The -dimensional vectors
[TABLE] 2. 2.
The diagonal matrices
[TABLE] 3. 3.
The Toeplitz matrix
[TABLE] 4. 4.
The matrices
[TABLE]
where denotes the usual Kronecker product.
Thus the CN-WSGD method (2.4) can be written in the following matrix form
[TABLE]
where , , and denotes the identity matrix. Multiplying both sides by , we obtain
[TABLE]
By defining
[TABLE]
the linear system (2.7), which has to be solved at each time step , can be written as
[TABLE]
Let , then the following results are proved in [31]. If , from Proposition 2.1 it follows that , then the coefficient matrix is a strictly diagonally dominant M-matrix. In addition, when the diffusion coefficients are time independent, the CN-WSGD scheme is unconditionally stable and the truncation error is . On the other side, when and , with and nonnegative constants, then the matrix is a symmetric positive definite Block-Toeplitz-Toeplitz-Blocks (BTTB) matrix, cf. Definition 3.1.
3 Spectral analysis of the coefficient matrix
Given the notion of symbol and of spectral distribution in the eigenvalue and singular value sense, in this section we provide a spectral analysis of the coefficient matrix-sequence . In the constant coefficient case, as already observed in other papers (see e.g., [16]), the coefficient matrix-sequence is a BTTB sequence: then using well-known spectral tools for BTTB sequences we determine its symbol and study its spectral distribution. In the nonconstant coefficients case, under appropriate conditions, we show that, belongs to the GLT class and use the GLT machinery to analyse its singular value/eigenvalue distribution. The resulting spectral information will be then used in Section 4 for the analysis and the design of numerical solvers to be applied to the considered linear systems.
3.1 Constant diffusion coefficients case
Let us assume that the diffusion coefficients are constant. Under this condition, is a sequence of BTTB matrices, or equivalently of 2-level Toeplitz matrices according to the following definition.
Definition 3.1**.**
Let and let be the sequence of its Fourier coefficients defined as
[TABLE]
where . Then the -level Toeplitz matrix of partial orders associated with is
[TABLE]
where is the order of the matrix. The function is called the symbol of the matrix-sequence .
To clarify the notation for the case of interest, the BTTB matrix of order associated with is
[TABLE]
or equivalently,
[TABLE]
where are matrices whose entry -th equals if and is [math] elsewhere.
When , i.e., for Toeplitz matrices, we simplify the notation using
[TABLE]
Definition 3.2**.**
The Wiener class is the set of functions
[TABLE]
We determine the sequence of symbols associated to as a corollary of the following proposition.
Proposition 3.3**.**
Let and let be defined as in (2.5). Then the symbol associated to the matrix-sequence belongs to the Wiener class and its formal expression is given by
[TABLE]
Proof.
Let us observe that , with for , and let us define the function
[TABLE]
Then, by Definition 3.1, it holds that . When , it is easy to see that lies in the Wiener class. In detail, from Proposition 2.1 we know that , , for and , and for . Then
[TABLE]
Again from Proposition 2.1, we deduce that
[TABLE]
that is
[TABLE]
The righthand side of the previous relation is a positive constant for , then we can conclude that belongs to the Wiener class. An explicit formula for the symbol can be obtained combining the definition of given in (2.2) with the one of in (2.3) as follows
[TABLE]
Applying the well-known binomial series
[TABLE]
with , the thesis follows. ∎
Corollary 3.4**.**
Let and let us assume that . Then the matrix defined as in (2.8) is the BTTB matrix with
[TABLE]
where and .
Proof.
According to Definition 3.1 and thanks to Proposition 3.3, the BTTB terms of and in equation (2.6) can be written as
[TABLE]
where and .
Finally, recalling that for a real function it holds and replacing equations (2.6) in (2.8), we obtain with defined as in (3.3). ∎
Now we focus our attention on the spectral distribution of , under the further assumption that the diffusion coefficients are equal on both sides. By this hypothesis, is a sequence of symmetric BTTB matrices. Let us start with the definition of the spectral distribution in the sense of the eigenvalues and of the singular values.
Definition 3.5**.**
Let be a measurable function, defined on a measurable set with , . Let be the set of continuous functions with compact support over and let be a sequence of matrices of size with eigenvalues , and singular values , .
- •
* is distributed as the pair in the sense of the eigenvalues, in symbols , if the following limit relation holds for all *
[TABLE]
- •
* is distributed as the pair in the sense of the singular values, in symbols , if the following limit relation holds for all *
[TABLE]
For Hermitian -level Toeplitz matrix-sequences, the following theorem due to Szegö and Tilli holds (see [11, 32]).
Theorem 3.6**.**
Let be a real-valued function, then
[TABLE]
Now, we recall a property of the spectral norm of -level Toeplitz matrices and we state a relevant theorem contained in [12]. Given a square matrix of order , we denote its spectral norm by and its trace norm by . For the spectral norm of a -level Toeplitz sequence generated by it holds (see Corollary 3.5 in [26])
[TABLE]
Theorem 3.7**.**
(Theorem 3.4 in [12]) Let be a matrix-sequence with and Hermitian . Assume that
- •
,
- •
* are bounded by a constant independent of ,*
- •
.
Then .
The following proposition concerns the eigenvalue distribution of the coefficient matrix-sequence when the diffusion coefficients are constant and equal.
Proposition 3.8**.**
Let us assume that , that , and . Let be defined as in (3.1) and define
[TABLE]
Given the matrix-sequence , we have
[TABLE]
where
[TABLE]
is a real-valued continuous function and it is nonnegative for .
Proof.
Since the diffusion coefficients and are constant and equal to real positive numbers and , respectively, the matrices of the sequence (see relation (2.6)) are symmetric. The function
[TABLE]
belongs to the Wiener algebra since itself is in the same algebra (see Proposition 3.3). Furthermore, from its expression, it also follows that is real-valued and globally continuous. Hence, is real-valued and globally continuous. Similarly, the nonnegativity of follows from the nonnegativity of which in turn can be easily derived from the expression of in (3.1) for and .
From Theorem 3.6 with and since , it follows that . Furthermore, using (3.4) with , we have that
[TABLE]
with independent of . Moreover, under the hypothesis that , the remaining term is such that and for some constant independent of . By Theorem 3.7, we conclude that the distribution of is decided only by . ∎
Let us recall that both and are nonnegative functions. Moreover, as a straightforward consequence of Proposition 4 in [8], it is easy to prove that and have a zero at [math] of order and , respectively. With this result in mind, in the following proposition we prove that the superior limit of , with , is bounded as . Such a proposition will be used in Section 4 for proving the constant converge rate of the two-grid and the V-cycle.
Proposition 3.9**.**
Let , with , and let , then there exist two real constants such that
[TABLE]
*where and . *
Proof.
Let us rewrite and in polar form
[TABLE]
where
[TABLE]
for . According to the polar form we have that
[TABLE]
Moreover, we have that
[TABLE]
and
[TABLE]
Note that if and , i.e., , then both limits (3.10) and (3.11) are finite. Moreover, since
[TABLE]
from relation (3.9) it holds that
if , then
[TABLE] 2. 2.
if , then
[TABLE]
It is to convince the reader that does not exists. Indeed, if w.l.o.g. we fix , then along the lines and we get
[TABLE]
where is a positive constant. The equalities are due to the fact that and have a zero at [math] of order and , respectively, with by hypothesis. Therefore, it yields that
[TABLE]
and this, observing that is nonnegative, completes the proof. ∎
3.2 Nonconstant diffusion coefficients case
Now we focus on the symbol associated to and on its spectral distribution, when , , and are nonconstant functions. For this purpose we need the notion of GLT class and the related theory, starting from the pioneering work by Tilli [30] and widely generalized in [28]. In short, the GLT class is an algebra virtually containing any sequence of matrices coming from “reasonable” approximations by local discretization methods (finite differences, finite elements, isogeometric analysis, etc) of partial differential equations (see [10]), containing the multilevel Toeplitz sequences with Lebesgue integrable generating functions. The formal definition is rather technical and involves a heavy notation: therefore we just give and briefly discuss the notion in two dimensions, which is the case of interest in our setting, and we report few properties of the GLT class [10] , which are sufficient for studying the spectral features of the matrices
Since a GLT sequence is a sequence of matrices obtained from a combination of some algebraic operations on multilevel Toeplitz matrices and diagonal sampling matrices, we need the following definition.
Definition 3.10**.**
Given a Riemann-integrable function , by diagonal sampling matrix of order we mean D_{N}(a)={\rm diag}_{{}_{j_{1}=1,\dots,n_{1}\atop j_{2}=1,\dots,n_{2}}}a\bigg{(}\frac{j_{1}}{n_{1}},\frac{j_{2}}{n_{2}}\bigg{)}.
Throughout, we use the following notation
[TABLE]
to say that the sequence is a GLT sequence with symbol .
Here we report four main features of the GLT class in two dimensions.
GLT1
Let with , , then . If the matrices are Hermitian, then it holds also .
GLT2
The set of GLT sequences form a -algebra, i.e., it is closed under linear combinations, products, inversion (whenever the symbol vanishes, at most, in a set of zero Lebesgue measure), conjugation: hence, the sequence obtained via algebraic operations on a finite set of input GLT sequences is still a GLT sequence and its symbol is obtained by following the same algebraic manipulations on the corresponding symbols of the input GLT sequences.
GLT3
Every BTTB sequence generated by an function is such that , with the specifications reported in item . Every diagonal sampling sequence , where is a Riemann integrable function, is such that .
GLT4
Let , , then .
Proposition 3.11**.**
Let us assume that and and that, fixed the instant of time , , , and are Riemann integrable over . For the matrix it holds
[TABLE]
with
[TABLE]
where
[TABLE]
and . Furthermore,
[TABLE]
and whenever , we also have
[TABLE]
with real-valued and indeed all the matrices have only real eigenvalues.
Proof.
Let us observe that, fixed the instant of time , the diagonal elements of the matrices and are a uniform sampling of the functions , and , respectively, with . Therefore, thanks to [GLT3] and to the Riemann integrability of the diffusion coefficients it yields
[TABLE]
Since the GLT class is stable under linear combinations and products [GLT2] and since BTTB sequences with symbols lie in the GLT class [GLT3], it is immediate to see that, under the hypothesis that , the matrix-sequence is still a member of the GLT class and
[TABLE]
where , are defined as in (3.14) and . Moreover, the sequence (under the hypothesis that ) is a GLT sequence with zero symbol, item [GLT4]. This together with [GLT2] and (3.15) implies that . Then by [GLT1] we can conclude and hence , after an affine change of variable.
Now, by exploiting Proposition 3.3 and Proposition 3.8, since is real-valued, it is clear that if then is real-valued. Furthermore, under the hypothesis we deduce that is a positive definite diagonal block matrix, and choosing as the positive definite square root of , we find that is similar to and real symmetric. Therefore, all the eigenvalues of are real and we plainly have , by exploiting again the GLT machinery, as done before but in the Hermitian setting. ∎
Now, let us assume that all the diffusion coefficients are uniformly bounded and positive. Under this hypothesis, the following proposition can be seen as an extension to the nonconstant coefficients case of the result for constant coefficients shown in Proposition 3.9.
Proposition 3.12**.**
Let us assume that . Given as in (3.5) and as in (3.11), the following limit relation holds
[TABLE]
where , , and .
Proof.
As in the proof of Proposition 3.9, we exploit the polar form of and , for , and rewrite and as follows
[TABLE]
[TABLE]
According to the definition of in (3.8), it is easy to see that for and , we have
[TABLE]
Combining relations (3.2) and (3.17), we obtain
[TABLE]
Now, we calculate the limit of the quotient . If , we have
[TABLE]
and then thanks to relations (3.18), (3.2), and (3.20), we conclude that
[TABLE]
If , we obtain an analogous result with , , in place of , , , respectively, and the thesis is proved. ∎
4 Multigrid methods
Multigrid methods have shown to be a valid alternative to preconditioned Krylov methods also for FDEs [21]. Using the Ruge-Stüben theory [25], Theorem 4 in [8] and in [21] for one-dimensional case shows that, in the constant and varying coefficients cases, i.e., and , respectively, the two-grid method converges with a linear convergence rate independent of and . Since in this case the matrix is a BTTB matrix, the classical multigrid theory for BTTB matrices developed in [9, 6, 27, 2] can be directly applied when the symbol is known. Under the assumptions that , , and , according to our previous analysis in Subsection 3.1, the symbol of the BTTB sequence is (cf. Proposition 3.8).
Consider the stationary iterative method
[TABLE]
for the solution of the linear system where , and . Given a full-rank matrix , with , a Two-Grid Method (TGM) is defined by the following algorithm [13].
[TABLE]
Step 6) consists in applying the “post-smoothing iteration” (4.1) times while steps 1) 5) define the “coarse grid correction” which depends on the grid transfer operator . The iteration matrix of TGM is then given by
[TABLE]
It could be possible also to add a “pre-smoothing” iteration in the TGM algorithm before the step 1). We do not add it here for keeping the theoretical analysis as simple as possible. Nevertheless, the addition of a convergent pre-smoother, obviously, cannot deteriorate the convergence of the algorithm but only accelerate the convergence.
Here we want to give a formal proof of convergence and of the optimality of TGM. We recall some convergence results in [25] from the theory of the algebraic multigrid method which are the main theoretical tools for giving our convergence proof. By we denote the Euclidean norm and whenever is positive definite, denotes the energy norm of . Finally if and are Hermitian matrices, then is equivalent to write that is nonnegative definite.
Theorem 4.1**.**
[25]** Let be a positive definite matrix of size and let be defined as in the TGM algorithm. Suppose that independent of such that
[TABLE]
where is the diagonal matrix having the same diagonal of . Assume that independent of such that
[TABLE]
Then and
[TABLE]
When is huge also the coarser problem can be too large to be solved directly. Hence, the point 4) in the TGM should be replaced by a recursive application of the same strategy until we reach a sufficiently small size. A reliable initial guess for the coarser linear system is the zero vector. Indeed, at the coarser levels the linear system to be solved in the error equation and hence the solution goes to zero whenever the whole iteration converges. The resulting algorithm is usually referred as V-cycle.
4.1 Two-grid convergence analysis
The convergence analysis of the TGM is firstly developed in the constant coefficient case, i.e., in the case . The variable coefficient case will be discussed at the end of the subsection.
Let us start showing that damped Jacobi, with a proper choice of the relaxation parameter, satisfies the smoothing property (4.2).
Lemma 4.2**.**
*Let with defined as in (3.5) and let , where is a real number and with the Fourier coefficient of of order zero. Moreover, let us assume that and . If we choose , then such that inequality (4.2) holds true. *
Proof.
By recalling that is nonnegative, it follows that . Hence the relation given in (4.2) is equivalent to the inequality
[TABLE]
which can be rewritten as
[TABLE]
using a congruence transformation with . Since , where and (cf. (2.1)), under the hypothesis that and , we have that is independent of . The inequality (4.4) reads as
[TABLE]
which is implied by the function inequality . From the latter inequality, if then it exists such that equation (4.2) holds and the thesis is proved (see Proposition 3 in [] for more details). ∎
In order to prove the approximation property (4.3), we define the projector as
[TABLE]
where the function will be defined later and is the bidimensional down-sampling operator. Let with , the one-dimensional down-sampling matrix is defined as
[TABLE]
Lemma 4.4 gives the theoretical conditions that has to meet in order to satisfy (4.3). A preliminary proposition is necessary to extend the theoretical analysis usually done matrix algebras, like circulant or matrices (see e.g. [2]) also to (multilevel) Toeplitz matrices.
Proposition 4.3**.**
[29]** Given a matrix , let us denote by the diagonal matrix having the same diagonal of . Moreover, let us assume that the property (4.3) is fulfilled by a matrix-sequence . If there exists another matrix-sequence such that
[TABLE]
[TABLE]
, then property (4.3) is fulfilled by with the same as well.
Lemma 4.4**.**
Let with defined as in (3.5) and for . Moreover, let with the following polynomial
[TABLE]
and assume that and . Then there exists such that relation (4.3) holds true.
Proof.
The proof combines classical results on multigrid methods for Toeplitz matrices with the spectral results in Section 3. According to Proposition 3.8, the function is nonnegative and vanishes only at the origin. Moreover, thanks to Proposition 3.9, the polynomial in (4.7) satisfies the classical condition
[TABLE]
with being the set of the “mirror points” of defined as , see [27] for the derivation of condition (4.8). In particular, the condition (4.8) holds with . Therefore, the TGM defined in the two-level algebra would be convergent.
The result for BTTB matrices follows from Proposition 4.3. Note that is a two-level matrix and for . Hence the projector is exactly the same projector used in TGM defined in the two-levels algebra, cf. [27]. Finally, Proposition 4.3 can be applied replacing the sequences and with the sequences of two-levels and BTTB matrices generated by , respectively. Indeed the condition (4.5) follows from Theorem 7.1 in [27], while the condition (4.6) is a trivial consequence of Proposition 2.1 with . ∎
Remark 4.5**.**
The grid transfer operator associated to the symbol defined as in (4.7) is, up to a constant factor, the classical bilinear interpolation.
By Lemmas 4.2 and 4.4, it follows that there exist and such that inequalities (4.2) and (4.3) hold, respectively. Therefore, by Theorem 4.1, and . The result can be summarized as follows.
Theorem 4.6**.**
Let with defined as in (3.5).
- •
Let , where and with the Fourier coefficient of of order zero (Jacobi smoother).
- •
Let with defined as in (4.7) (bilinear interpolation).
Moreover, let us assume that and . Then the assumptions of Theorem 4.1 are satisfied and it holds
[TABLE]
with a constant independent of , and .
The variable coefficients case can be addressed thanks to the extension of the previous results given in [27]. Let and be four uniformly bounded and positive functions. Then the linear convergence rate of the two-grid method is preserved combining Proposition 3.12 with Lemma 6.2 in [27].
4.2 V-cycle and geometric mulgrid
The convergence analysis of the V-cycle is much more involved and a linear convergence rate has been proven, under a condition stricter than (4.8), only for sequences of matrices in some trigonometric algebras, like the algebra used in the proof of Lemma 4.4, see [2]. In details, the symbol of the grid transfer operator has to satisfy
[TABLE]
With respect to (4.8), the numerator does not have the power two and hence has to vanish at the mirror points with double order.
Remark 4.7**.**
Similarly to Lemma 4.4, for defined as in (4.7) the condition (4.9) holds true with .
The previous remark suggests that the bilinear interpolation is powerful enough to work also under some perturbations. In particular, we could use the geometric multigrid instead of the Galerkin approach. The geometric multigrid defines the coarser matrix rediscretizing the original differential operator on the coarser grid instead of computing by . This saves some computational costs with respect to Galerkin approach which requires the computation of the coarser matrices at each recursion level in a precomputing phase. In particular, in the variable coefficient case the coarser matrices lose the GLT structure in terms of diagonal and BTTB matrices. Therefore we should memorize all the coefficients of the two-level lower Hessenberg coefficient matrices and the computational cost of the precomputing phase is much higher than the arithmetic operations required for the matrix-vector product with the matrix .
In conclusion, the Galerkin approach is useful for the theoretical analysis in Subsection 4.1, but it is not computationally feasible. Therefore, we implement the geometric approach and we denote by one iteration of the geometric multigrid algorithm applied to the linear system with coefficient matrix and defined by the following setting:
- •
V-cycle with a number of recursion levels depending on the size (coarsest grid fixed to ).
- •
The smoother is one step of pre- and post-smoother for the classical Jacobi method (according to Lemma 4.2).
- •
The grid transfer operator is the standard bilinear interpolation and full-weighting corresponding to a proper scaling of the symbol defined as in (4.7) (according to Lemma 4.4).
4.3 Preconditioning
According to the theoretical analysis in Subsection 4.1 and the discussion at the end of Subsection 4.2, our geometric multigrid can be effectively used as a stand alone solver. Nevertheless, in order to increase their robustness, multigrid methods are often applied as preconditioners for Krylov methods. Moreover, the application of multigrid methods as preconditioners allows a simple comparison with existing strategies based on (inverse) band or circulant preconditioners, see e.g., [14, 16, 34].
A further possibility is the combination of a band preconditioner with a multigrid method based on the Galerkin approach. The resulting strategy keeps the computational cost of the precomputing phase to . A simple band preconditioner suitable for Galerkin multigrid methods is the Laplacian preconditioner (). Such preconditioner is inspired by the 1D analysis in [8], where the use of the Laplacian was introduced for fractional derivatives of order greater than 1.5. In this paper, independently of the values of and , we propose the preconditioner
[TABLE]
where , for is the Laplacian matrix. The subscript in recall the order of the derivative used in the preconditioner. Note that the matrix has only five nonzero diagonals. On the other hand, due to fill-in, the direct solution of the linear system with coefficient matrix is not feasible. Therefore, instead of solving the corresponding linear system, we apply only one step of our multigrid consisting of the standard Galerkin approach with Jacobi smoothing and bilinear interpolation as grid transfer operator. Note that Theorem 4.6 holds also with and the resulting preconditioner has a computational cost linear in .
The goodness of this preconditioner depends on the values of and . To define a preconditioner with the same structure of the matrix and keeping at the same time a small bandwidth, the symbol of a banded BTTB matrix has to be a trigonometric polynomial and hence the zero of the symbol cannot be of fractional order. Nevertheless, the condition number of the preconditioned matrix is asymptotical to , with s.t. , and hence the number of iterations of a conjugate gradient type method grows as , see [3]. In conclusion the preconditioner is a good choice whenever or are close to , as confirmed by the numerical results in the next section.
5 Numerical results
In this section we test the effectiveness of our new multigrid preconditioners and defined in the previous section. Since two of the three proposed examples are taken from [14] we compare the performances of our preconditioners with the proposal therein (same Krylov method and same tolerance are used).
In the following tables, according to the notation defined in Section 1, and denote the numbers of spatial partitions in -direction and -direction, respectively, while denotes the number of time steps. In all our examples we simply fix . The infinite norm of the difference between the exact solution and the numerical solution at the last time step is denoted by “”. The number of iterations is computed as the arithmetic average of the number of iterations required for solving (2.9) at each time step . For this reason our multigrid preconditioners will be simply denoted by and . We use the GMRES method to solve the discretized linear systems (2.9). The GMRES method is computationally performed using the built-in gmres Matlab function with tolerance with a restarting every 20 iterations, even if for the preconditioned iterations it is not necessary. The initial guess at each time step is chosen as the zero vector. Our computations are performed using Matlab 7.10 software on a Pentium IV, 3.50 GHz CPU machine with 4 Gbyte of memory.
5.1 Example 1
The first example is taken from [22]. We consider a FDE of type (2.1) with . The nonconstant diffusion coefficients are given by
[TABLE]
The spatial domain is , while the time interval is . The initial condition is
[TABLE]
and the source term is such that the solution to the FDE is given by .
Let , with . Therefore, we have
[TABLE]
Since and we obtain and . Then when , both and tend to zero. As a consequence, the term in goes slowly to zero and for large the term in becomes dominant.
Table 1 compares the iterations and the accuracy of numerical solutions provided by the GMRES without preconditioning and by the GMRES preconditioned with our proposals. For comparison we report also the number of iterations required by the “exact” preconditioners and , where denotes the preconditioner with the direct solution of the resulting linear system and denotes the proposed multigrid applied to the matrix , but using the Galerkin approach instead of the rediscretization. Note that the geometric approach is as effective as the Galerkin approach ( and provide the same number of iterations). On the contrary, one step of V-cycle for the preconditioner is not as good as its direct inversion, i.e., . Nevertheless, it shows a linear convergence rate and it is computationally very cheap since it is a pentadiagonal matrix. A good compromise could be to increase the number of iterations of the preconditioner , for instance, fixing and applying two steps of V-cycle instead of one, the number of iterations reduces from 17 to 14.
5.2 Example 2
This example is taken from [14]. We consider a FDE of type (2.1) with . The nonconstant diffusion coefficients are given by
[TABLE]
The spatial domain is , while the time interval is . The initial condition is
[TABLE]
and the source term is such that the solution to the FDE is given by .
Note that
[TABLE]
Since and we obtain and , i.e., when , goes to zero and tends to infinity. On the other hand, grows very slowly and the numerical results are not affected by this small grow.
In Table 2 we compare the average number of iterations obtained by the GMRES and by our preconditioners with the performances of the preconditioner proposed in [14] and denoted by . Moreover, we show the accuracy of the numerical solution. We observe that in this example the average number of iterations for the preconditioner grows with the algebraic size of the problem. Conversely, our preconditioners show a linear convergence rate with a lower computational cost per iteration, especially the preconditioner which has only five nonzero diagonals and applies only two Jacobi iterations on each grid.
5.3 Example 3
Also this third example is taken from [14] and are the same as in Example 2. The nonconstant diffusion coefficients are given by
[TABLE]
The spatial domain is , while the time interval is . The initial condition is
[TABLE]
and the source term is such that the solution of the FDE is given by .
The behaviour of and is the same of the previous example. The results in Table 3 are comparable to those of the previous example and again our preconditioners lead to a fast convergence with a number of iterations independent of the size .
6 Conclusions
In this paper we have investigated two-dimensional space-FDE problems discretized by means of a second order finite difference scheme obtained as combination of the Crank-Nicolson scheme and the so-called weighted and shifted Grünwald formula. We have provided a detailed spectral analysis of the coefficient matrix by means of its symbol, both in the constant and variable coefficients case. Thanks to the symbol analysis and the theory of multigrid methods for BTTB matrices, we have designed a classical multigrid method particularly effective for the considered problem. Two multigrid preconditioners for GMRES has been proposed and some numerical examples taken from the literature show that they provide a fast convergence independent of the algebraic size of the problem.
Multigrid methods have shown a good scalability in the dimensionality of the problem and they could be a good choice also for 3D and 4D problems since their extension is straightforward. Moreover, they could be effectively applied also to other discretization strategies, like the finite volume method proposed in [20].
Acknowledgment
The work of the last two authors is partly supported by the Italian grant MIUR - PRIN 2012 N. 2012MTE38N and by GNCS-INDAM (Italy).
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] A. Aricò, M. Donatelli, A V-cycle Multigrid for multilevel matrix algebras: proof of optimality , Numer. Math., 105 (2007) 511-547.
- 2[2] A. Aricò, M. Donatelli, S. Serra-Capizzano, V-cycle optimal convergence for certain (multilevel) structured linear systems , SIAM J. Matrix Anal. Appl., 26 (2004) 186-214.
- 3[3] O. Axelsson, G. Lindskog, On the rate of convergence of the preconditioned conjugate gradient method , Numer. Math., 48 (1986) 499-523.
- 4[4] J. Bai, X. Feng, Fractional-order anisotropic diffusion for image denoising , IEEE Tran. Image Process., 16 (2007) 2492-2502.
- 5[5] B. A. Carreras, V. E. Lynch, G. M. Zaslavsky, Anomalous diffusion and exit time distribution of particle tracers in plasma turbulence model , Phys. Plasmas, 8 (2001) 5096-5103.
- 6[6] R. H. Chan, Q. Chang, H. W. Sun, Multigrid method for ill-conditioned symmetric Toeplitz systems , SIAM J. Sci. Comput., 19 (1998) 516-529.
- 7[7] M. Chen, Y. Wang, X. Cheng, W. Deng, Second-order LOD multigrid method for multidimensional Riesz fractional diffusion equation , BIT Numer. Math., 54 (2014) 623-637.
- 8[8] M. Donatelli, M. Mazza, S. Serra-Capizzano, Spectral analysis and structure preserving preconditioners for fractional diffusion equations , J. Comput. Phys., 307 (2016) 262-279.
