Spectral analysis and multigrid preconditioners for two-dimensional space-fractional diffusion equations

Hamid Moghaderi; Mehdi Dehghan; Marco Donatelli; Mariarosa Mazza

arXiv:1706.06844·math.NA·October 11, 2017·J. Comput. Phys.

TL;DR

This paper develops spectral analysis and multigrid preconditioners for 2D space-fractional diffusion equations, enabling efficient iterative solutions by exploiting Toeplitz structure and proving linear convergence.

Contribution

It introduces a spectral analysis framework for 2D space-FDEs and designs multigrid preconditioners that ensure fast, robust iterative solutions with proven convergence rates.

Findings

01

Spectral analysis of coefficient matrices guides preconditioner design.

02

Multigrid methods achieve linear convergence rates.

03

Preconditioned Krylov methods maintain efficiency with new strategies.

Abstract

Fractional diffusion equations (FDEs) are a mathematical tool used for describing some special diffusion phenomena arising in many different applications like porous media and computational finance. In this paper, we focus on a two-dimensional space-FDE problem discretized by means of a second order finite difference scheme obtained as combination of the Crank-Nicolson scheme and the so-called weighted and shifted Grünwald formula. By fully exploiting the Toeplitz-like structure of the resulting linear system, we provide a detailed spectral analysis of the coefficient matrix at each time step, both in the case of constant and variable diffusion coefficients. Such a spectral analysis has a very crucial role, since it can be used for designing fast and robust iterative solvers. In particular, we employ the obtained spectral information to define a Galerkin multigrid method based on…

Tables

Table 1: Comparison of the average number of iterations of GMRES with different preconditioners for Example 5.1.

| $n_{1} = n_{2}$ | GMRES | $\mathcal{P}_{2,N}^{(m)}$ | $\tilde{\mathcal{P}}_{2,N}^{(m)}$ | $\mathcal{P}_{MGM,N}^{(m)}$ | $\tilde{\mathcal{P}}_{MGM,N}^{(m)}$ | Error |
|---|---|---|---|---|---|---|
| $2^{4}$ | $37.000$ | $21.000$ | $9.000$ | $10.000$ | $10.000$ | $9.3706 \times 10^{-2}$ |
| $2^{5}$ | $73.000$ | $18.781$ | $9.000$ | $11.000$ | $11.000$ | $2.4747 \times 10^{-2}$ |
| $2^{6}$ | $137.000$ | $17.000$ | $9.000$ | $11.000$ | $11.000$ | $6.3630 \times 10^{-3}$ |
| $2^{7}$ | $251.000$ | $17.000$ | $10.000$ | $10.000$ | $10.000$ | $1.6053 \times 10^{-3}$ |

Table 2: Comparison of the average number of iterations of GMRES with different preconditioners for Example 5.2. The preconditioner $P_{JLZ}$ has been proposed in [14].

| $n_{1} = n_{2}$ | GMRES | $P_{JLZ}$ | $\mathcal{P}_{2}$ | $\mathcal{P}_{MGM}$ | Error |
|---|---|---|---|---|---|
| $2^{4}$ | $48.750$ | $11.000$ | $18.063$ | $9.000$ | $1.1386 \times 10^{-6}$ |
| $2^{5}$ | $81.594$ | $12.406$ | $15.813$ | $9.000$ | $3.0206 \times 10^{-7}$ |
| $2^{6}$ | $157.750$ | $14.250$ | $11.531$ | $10.000$ | $7.6925 \times 10^{-8}$ |
| $2^{7}$ | $273.914$ | $17.055$ | $12.000$ | $9.891$ | $1.9392 \times 10^{-8}$ |

Table 3: Comparison of the average number of iterations of GMRES with different preconditioners for Example 5.3. The preconditioner $P_{JLZ}$ has been proposed in [14].

| $n_{1} = n_{2}$ | GMRES | $P_{JLZ}$ | $\mathcal{P}_{2}$ | $\mathcal{P}_{MGM}$ | Error |
|---|---|---|---|---|---|
| $2^{4}$ | $36.000$ | $11.000$ | $17.000$ | $9.000$ | $1.1486 \times 10^{-6}$ |
| $2^{5}$ | $63.694$ | $12.250$ | $15.000$ | $9.000$ | $2.9187 \times 10^{-7}$ |
| $2^{6}$ | $113.234$ | $13.813$ | $11.000$ | $8.000$ | $7.4461 \times 10^{-8}$ |
| $2^{7}$ | $173.008$ | $15.465$ | $11.000$ | $8.000$ | $1.8782 \times 10^{-8}$ |

Equations

\begin{cases} \frac{\partial u(x,y,t)}{\partial t} = d_{+}(x,y,t) \frac{\partial^{\alpha} u(x,y,t)}{\partial_{+} x^{\alpha}} + d_{-}(x,y,t) \frac{\partial^{\alpha} u(x,y,t)}{\partial_{-} x^{\alpha}} + e_{+}(x,y,t) \frac{\partial^{\beta} u(x,y,t)}{\partial_{+} y^{\beta}} + e_{-}(x,y,t) \frac{\partial^{\beta} u(x,y,t)}{\partial_{-} y^{\beta}} + v(x,y,t), & (x,y,t) \in \Omega \times (0,T], \\ u(x,y,t) = 0, & (x,y,t) \in \partial\Omega \times [0,T], \\ u(x,y,0) = u_{0}(x,y), & (x,y) \in \bar{\Omega}, \end{cases}

\frac{\partial ^{α} u ( x , y , t )}{\partial _{+} x ^{α}} = \frac{1}{Γ ( 2 - α )} \frac{d ^{2}}{d x ^{2}} \int_{a_{1}}^{x} \frac{u ( ξ , y , t )}{( x - ξ ) ^{α - 1}} d ξ,

\frac{\partial ^{β} u ( x , y , t )}{\partial _{+} y ^{β}} = \frac{1}{Γ ( 2 - β )} \frac{d ^{2}}{d y ^{2}} \int_{a_{2}}^{y} \frac{u ( x , η , t )}{( y - η ) ^{β - 1}} d η,

x_{i} = a_{1} + i h_{x},

y_{j} = a_{2} + j h_{y},

t^{(m)} = m △ t,

t^{(m - 1/2)} := \frac{t ^{(m)} + t ^{(m - 1)}}{2},

{}_{L}D_{h_{x}}^{\alpha} u(x_{i}, y_{j}, t^{(m)}) = \frac{1}{h_{x}^{\alpha}} \sum_{k=0}^{i} w_{k}^{(\alpha)} u(x_{i-k+1}, y_{j}, t^{(m)}),

{}_{L}D_{h_{y}}^{\beta} u(x_{i}, y_{j}, t^{(m)}) = \frac{1}{h_{y}^{\beta}} \sum_{k=0}^{j} w_{k}^{(\beta)} u(x_{i}, y_{j-k+1}, t^{(m)}),

\frac{\partial^{\alpha} u(x_{i}, y_{j}, t^{(m)})}{\partial_{+} x^{\alpha}} = {}_{L}D_{h_{x}}^{\alpha} u(x_{i}, y_{j}, t^{(m)}) + O(h_{x}^{2}),

\frac{\partial^{\beta} u(x_{i}, y_{j}, t^{(m)})}{\partial_{+} y^{\beta}} = {}_{L}D_{h_{y}}^{\beta} u(x_{i}, y_{j}, t^{(m)}) + O(h_{y}^{2}),

w_{0}^{(\gamma)} = \frac{\gamma}{2} g_{0}^{(\gamma)}, \qquad w_{k}^{(\gamma)} = \frac{\gamma}{2} g_{k}^{(\gamma)} + \frac{2-\gamma}{2} g_{k-1}^{(\gamma)}, \quad k \geq 1,

g_{k}^{(\gamma)} = (-1)^{k} \binom{\gamma}{k} = \frac{(-1)^{k}}{k!} \gamma(\gamma-1)\cdots(\gamma-k+1), \quad k = 0, 1, \dots,

\begin{cases} g_{0}^{(\gamma)} = 1, \quad g_{1}^{(\gamma)} = -\gamma, \quad g_{2}^{(\gamma)} > g_{3}^{(\gamma)} > \dots > 0, \\ \sum_{j=0}^{\infty} g_{j}^{(\gamma)} = 0, \quad \sum_{j=0}^{n} g_{j}^{(\gamma)} < 0, \ n \geq 1, \\ g_{j}^{(\gamma)} = O(j^{-(\gamma+1)}), \end{cases}

\begin{cases} w_{0}^{(\gamma)} = \frac{\gamma}{2}, \quad w_{1}^{(\gamma)} = \frac{2-\gamma-\gamma^{2}}{2} < 0, \quad w_{2}^{(\gamma)} = \frac{\gamma(\gamma^{2}+\gamma-4)}{4}, \\ 1 \geq w_{0}^{(\gamma)} \geq w_{3}^{(\gamma)} \geq w_{4}^{(\gamma)} \geq \dots \geq 0, \\ \sum_{j=0}^{\infty} w_{j}^{(\gamma)} = 0, \quad \sum_{j=0}^{n} w_{j}^{(\gamma)} < 0, \ n \geq 2, \\ w_{j}^{(\gamma)} = O(j^{-(\gamma+1)}). \end{cases}

(1 - δ_{x}^{α, (m)} - δ_{y}^{β, (m)}) u_{i, j}^{(m)} = (1 + δ_{x}^{α, (m - 1)} + δ_{y}^{β, (m - 1)}) u_{i, j}^{(m - 1)} + △ t v_{i, j}^{(m - 1/2)},

\delta_{x}^{\alpha,(m)} = \frac{d_{i,j}^{+,(m)} \triangle t}{2} \, {}_{L}D_{h_{x}}^{\alpha} + \frac{d_{i,j}^{-,(m)} \triangle t}{2} \, {}_{R}D_{h_{x}}^{\alpha}, \qquad \delta_{y}^{\beta,(m)} = \frac{e_{i,j}^{+,(m)} \triangle t}{2} \, {}_{L}D_{h_{y}}^{\beta} + \frac{e_{i,j}^{-,(m)} \triangle t}{2} \, {}_{R}D_{h_{y}}^{\beta},

d_{i, j}^{\pm, (m)} = d_{\pm} (x_{i}, y_{j}, t^{(m)}), e_{i, j}^{\pm, (m)} = e_{\pm} (x_{i}, y_{j}, t^{(m)}) .

u^{(m)}

d_{\pm}^{(m)}

e_{\pm}^{(m)}

v^{(m - 1/2)}

D_{\pm}^{(m)} = diag (d_{\pm}^{(m)}), E_{\pm}^{(m)} = diag (e_{\pm}^{(m)}) .

A^{\gamma}_{\tilde{N}}=-\left(\begin{array}[]{cccccc}w_{1}^{(\gamma)}&w_{0}^{(\gamma)}&0&\cdots&0&0\\ w_{2}^{(\gamma)}&w_{1}^{(\gamma)}&w_{0}^{(\gamma)}&\ddots&\ddots&0\\ \vdots&w_{2}^{(\gamma)}&w_{1}^{(\gamma)}&\ddots&\ddots&\vdots\\ \vdots&\ddots&\ddots&\ddots&\ddots&\vdots\\ w_{\tilde{N}-1}^{(\gamma)}&\ddots&\ddots&\ddots&w_{1}^{(\gamma)}&w_{0}^{(\gamma)}\\ w_{\tilde{N}}^{(\gamma)}&w_{\tilde{N}-1}^{(\gamma)}&\cdots&\cdots&w_{2}^{(\gamma)}&w_{1}^{(\gamma)}\\ \end{array}\right).

A_{x}^{(m)} = D_{+}^{(m)} (I_{n_{2}} \otimes A_{n_{1}}^{α}) + D_{-}^{(m)} (I_{n_{2}} \otimes (A_{n_{1}}^{α})^{T}), A_{y}^{(m)} = E_{+}^{(m)} (A_{n_{2}}^{β} \otimes I_{n_{1}}) + E_{-}^{(m)} ((A_{n_{2}}^{β})^{T} \otimes I_{n_{1}}),

(I_{N} + r A_{x}^{(m)} + s A_{y}^{(m)}) u^{(m)} = (I_{N} - r A_{x}^{(m - 1)} - s A_{y}^{(m - 1)}) u^{(m - 1)} + △ t v^{(m - 1/2)},

\bigg{(}\frac{1}{r}I_{N}+A^{(m)}_{x}+\frac{s}{r}A^{(m)}_{y}\bigg{)}\mathbf{u}^{(m)}=\bigg{(}\frac{1}{r}I_{N}-A^{(m-1)}_{x}-\frac{s}{r}A^{(m-1)}_{y}\bigg{)}\mathbf{u}^{(m-1)}+2h_{x}^{\alpha}\mathbf{v}^{(m-1/2)}.

M_{(α, β), N}^{(m)}

b^{(m)}

M_{(α, β), N}^{(m)} u^{(m)} = b^{(m)} .

f_{\mathbf{k}} := \frac{1}{(2\pi)^{d}} \int_{[-\pi,\pi]^{d}} f(\bm{\theta}) \, {\rm e}^{-\mathbf{i}\langle \mathbf{k}, \bm{\theta} \rangle} \, d\bm{\theta},

T_{N}^{(d)} (f) := [f_{i - j}]_{i, j = 1}^{n} = [\dots [[f_{i_{1} - j_{1}, i_{2} - j_{2}, \dots, i_{d} - j_{d}}]_{i_{d}, j_{d} = 1}^{n_{d}}]_{i_{d - 1}, j_{d - 1} = 1}^{n_{d - 1}} \dots]_{i_{1}, j_{1} = 1}^{n_{1}},

\displaystyle T^{(2)}_{N}(f)=\bigg{[}\bigg{[}f_{[i_{1}-j_{1},i_{2}-j_{2}]}\bigg{]}_{{}_{i_{1},j_{1}=1}}^{n_{1}}\bigg{]}_{{}_{i_{2},j_{2}=1}}^{n_{2}},

T_{N}^{(2)}(f) = \sum_{|j_{1}| \leq n_{1}} \sum_{|j_{2}| \leq n_{2}} f_{[j_{1}, j_{2}]} \, J_{n_{1}}^{[j_{1}]} \otimes J_{n_{2}}^{[j_{2}]},

T_{N} (f) := T_{N}^{(1)} (f) .

Full text

Spectral analysis and multigrid preconditioners for two-dimensional space-fractional diffusion equations

Hamid Moghaderi

Department of Applied Mathematics, Faculty of Mathematics and Computer Science, Amirkabir University of Technology, No. 424, Hafez Ave., 15914, Tehran, Iran

Mehdi Dehghan

Department of Applied Mathematics, Faculty of Mathematics and Computer Science, Amirkabir University of Technology, No. 424, Hafez Ave., 15914, Tehran, Iran

Marco Donatelli

Department of Science and High Technology, University of Insubria, Via Valleggio 11, 22100 Como, Italy

Mariarosa Mazza

Division of Numerical Methods for Plasma Physics, Max Planck Institute for Plasma Physics, Boltzmannstrasse 2, 85748 Garching bei München, Germany

Abstract

Fractional diffusion equations (FDEs) are a mathematical tool used for describing some special diffusion phenomena arising in many different applications like porous media and computational finance. In this paper, we focus on a two-dimensional space-FDE problem discretized by means of a second order finite difference scheme obtained as combination of the Crank-Nicolson scheme and the so-called weighted and shifted Grünwald formula.

By fully exploiting the Toeplitz-like structure of the resulting linear system, we provide a detailed spectral analysis of the coefficient matrix at each time step, both in the case of constant and variable diffusion coefficients. Such a spectral analysis has a very crucial role, since it can be used for designing fast and robust iterative solvers. In particular, we employ the obtained spectral information to define a Galerkin multigrid method based on the classical linear interpolation as grid transfer operator and damped-Jacobi as smoother, and to prove the linear convergence rate of the corresponding two-grid method. The theoretical analysis suggests that the proposed grid transfer operator is strong enough for working also with the V-cycle method and the geometric multigrid. On this basis, we introduce two computationally favourable variants of the proposed multigrid method and we use them as preconditioners for Krylov methods. Several numerical results confirm that the resulting preconditioning strategies still keep a linear convergence rate.

**Keywords:** fractional diffusion equations; CN-WSGD scheme; spectral analysis; GLT theory; multigrid methods

1 Introduction

Fractional diffusion equations (FDEs) are a special class of partial differential equations (PDEs) used for describing subdiffusion and superdiffusion phenomena. Their applications include finance, biology, turbulent flow, and image processing [24, 15, 5, 4]. In more detail, a standard diffusion equation becomes a time-FDE when the first order derivative in time is replaced by a fractional derivative in the Caputo form (see e.g., [23] for a precise definition) and/or a space-FDE when the second order derivatives in space are replaced by fractional Riemann-Liouville derivatives (to be defined below). In this paper, we deal with space-FDEs and we refer to them simply as FDEs.

Closed-form solutions for FDEs are rarely available, so various numerical discretization methods have been proposed in recent decades. Among them, finite difference schemes have been widely studied. For instance, in 2004 and 2006, Meerschaert and Tadjeran proposed a first order accurate finite difference scheme obtained by combining an implicit Euler method in time with a first order approximation of the space derivatives called the shifted Grünwald formula [18, 19]. In 2006, together with Scheffler, the same authors proposed a finite difference discretization method based on the classical Crank-Nicolson (CN) scheme and Richardson extrapolation, which guarantee second order accuracy in time and space, respectively [17]. Both methods were proved to be consistent and unconditionally stable. More recently, in [31] the authors, inspired by the shifted Grünwald difference operator and the CN technique, defined a more general and flexible class of second order accurate methods combining the CN discretization in time with a second order approximation of the Riemann-Liouville fractional derivatives called weighted and shifted Grünwald difference (WSGD). Such methods are briefly referred to as CN-WSGD.

From a numerical linear algebra viewpoint, it is worth noticing that, if one of the previously described finite difference schemes is chosen, then the resulting linear system is strongly structured: the related coefficient matrices can be seen as a sum of diagonal times Toeplitz matrices (see e.g., [36]). As a consequence, the storage requirement is reduced from $\mathcal{O}(N^{2})$ to $\mathcal{O}(N)$ and the complexity of the matrix-vector product from $\mathcal{O}(N^{2})$ to $\mathcal{O}(N\log N)$, where $N$ is the size of the spatial mesh (the number of unknowns) at each time step.
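The $\mathcal{O}(N\log N)$ matrix-vector product mentioned above is usually obtained by embedding the Toeplitz matrix in a circulant one and applying FFTs. The following is a minimal one-level sketch of that standard technique, not code from the paper; the function name `toeplitz_matvec` and the random test data are illustrative assumptions.

```python
import numpy as np

def toeplitz_matvec(col, row, x):
    """Return T @ x, where T is Toeplitz with first column `col` and first row `row`.

    T is embedded in a circulant matrix of order 2n, whose product with a
    vector is a circular convolution evaluated with FFTs in O(n log n).
    Only the 2n - 1 defining entries of T are stored.
    """
    col, row, x = (np.asarray(v, dtype=float) for v in (col, row, x))
    n = x.size
    # First column of the circulant embedding: col, one padding zero,
    # then the first row reversed (without its first entry).
    c = np.concatenate([col, [0.0], row[:0:-1]])
    x_padded = np.concatenate([x, np.zeros(n)])
    y = np.fft.ifft(np.fft.fft(c) * np.fft.fft(x_padded))
    return y[:n].real


# Small check against the dense product (hypothetical random data).
rng = np.random.default_rng(0)
n = 64
col = rng.standard_normal(n)
row = rng.standard_normal(n)
row[0] = col[0]  # the diagonal entry is shared by first row and first column
x = rng.standard_normal(n)
T = np.array([[col[i - j] if i >= j else row[j - i] for j in range(n)]
              for i in range(n)])
max_diff = np.max(np.abs(toeplitz_matvec(col, row, x) - T @ x))
```

For a sum of diagonal times Toeplitz matrices, the same routine would be applied once per Toeplitz factor, followed by an entrywise scaling.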

Recently, several fast iterative solvers which exploit the aforementioned Toeplitz-like structure have been proposed for solving linear systems coming from a finite difference discretization of one-dimensional FDEs. Among them are a circulant preconditioning strategy for the Conjugate Gradient for Normal Residual method (CGNR) [16] and an efficient multigrid method [21]. The former has been proven to be superlinearly convergent when the diffusion coefficients are constant, the latter has been shown to be optimal when they are also equal. In [8], the authors recognize in the Toeplitz-like structure of the FDE linear systems a Generalized Locally Toeplitz (GLT) sequence and then use the GLT machinery to study its singular value/eigenvalue distribution as the matrix size diverges. The obtained spectral information is employed to propose two competitive tridiagonal structure preserving preconditioners for Krylov methods and to show that the superlinear convergence of the CGNR method with the circulant preconditioner in [16] cannot be replicated by any Krylov method in the nonconstant coefficient case, while the multigrid in [21] is optimal also for nonconstant coefficients under the only hypothesis that they are uniformly bounded and positive.

Of course, fast solvers for multidimensional FDE problems are of crucial interest. In the two-dimensional setting, we mention the ones proposed in [35, 14, 7]. The first reference deals with an alternating-direction finite difference method for two-dimensional FDEs by fully decoupling the linear systems in the $x$-direction and the $y$-direction. In [14] a preconditioner defined by the incomplete LU factorization of a block-banded-banded-block approximation of the coefficient matrix is introduced, and the linear systems are solved by a preconditioned Generalized Minimal Residual (GMRES) method and a preconditioned CGNR method. In [7] the authors propose a Locally One Dimensional (LOD) finite difference method for two-dimensional (and three-dimensional) Riesz FDE problems and use the LOD-multigrid method to solve the resulting linear systems. As for the one-dimensional case, also in the multidimensional setting the spectral analysis of the coefficient matrix is crucial to fully understand the behaviour of preconditioned Krylov and multigrid methods when used for solving the associated linear system. However, for multidimensional FDEs such a spectral study is missing even in the case of constant diffusion coefficients.

In this paper, we focus on a two-dimensional FDE problem discretized by means of the CN-WSGD scheme, and we provide a detailed spectral analysis of the resulting coefficient matrix at each time step, both in the constant and variable diffusion coefficients case. The obtained spectral information suggests that classical preconditioners based on multilevel circulant matrices are not suited for solving the involved linear systems, while multigrid methods can be effective and robust solvers. On this basis, we propose a Galerkin multigrid method with classical linear interpolation as grid transfer operator and damped-Jacobi as smoother, and we prove that the corresponding two-grid method has a linear convergence rate. More precisely, we compute the spectral symbol of the involved matrix-sequences both in the constant and variable coefficients case and use this knowledge to formally prove the convergence and the optimality of the two-grid method, i.e., we prove that the number of iterations required to reach a prescribed accuracy depends neither on the discretization step nor on the order of the fractional derivatives. The analysis of the symbol is also used for defining the parameter of the damped Jacobi such that it satisfies the smoothing property [25].
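To make the ingredients named above concrete (Galerkin coarse operator, linear interpolation, damped Jacobi), here is a minimal two-grid sketch on a one-dimensional model problem. It is an illustration under stated assumptions, not the paper's two-dimensional method: the matrix is the symmetric Toeplitz matrix $A^{\gamma}+(A^{\gamma})^{T}$ with $\gamma=1.8$, the damping parameter $\omega=2/3$ and the single pre- and post-smoothing step are illustrative choices, and dense linear algebra is used for clarity. All function names are hypothetical.

```python
import numpy as np

def wsgd_coefficients(gamma, n):
    """WSGD weights w_0, ..., w_n for shifts (p, q) = (1, 0)."""
    g = np.ones(n + 1)
    for k in range(1, n + 1):
        g[k] = g[k - 1] * (k - 1 - gamma) / k  # g_k = (-1)^k binom(gamma, k)
    w = 0.5 * gamma * g
    w[1:] += 0.5 * (2.0 - gamma) * g[:-1]
    return w

def model_matrix(gamma, n):
    """Symmetric 1D model problem A + A^T with A = -[w_{i-j+1}]."""
    w = wsgd_coefficients(gamma, n)
    A = np.zeros((n, n))
    for i in range(n):
        for j in range(min(i + 2, n)):  # entries with i - j + 1 >= 0
            A[i, j] = -w[i - j + 1]
    return A + A.T

def linear_interpolation(n):
    """Prolongation from (n - 1) // 2 coarse points to n fine points (n odd)."""
    nc = (n - 1) // 2
    P = np.zeros((n, nc))
    for j in range(nc):
        i = 2 * j + 1  # fine-grid index of the j-th coarse point
        P[i - 1, j], P[i, j], P[i + 1, j] = 0.5, 1.0, 0.5
    return P

def two_grid_step(A, Ac, P, b, x, omega=2.0 / 3.0):
    """One two-grid iteration: Jacobi, coarse-grid correction, Jacobi."""
    d = np.diag(A)
    x = x + omega * (b - A @ x) / d                      # pre-smoothing
    x = x + P @ np.linalg.solve(Ac, P.T @ (b - A @ x))   # coarse correction
    x = x + omega * (b - A @ x) / d                      # post-smoothing
    return x

n = 127
A = model_matrix(1.8, n)
P = linear_interpolation(n)
Ac = P.T @ A @ P  # Galerkin coarse operator
rng = np.random.default_rng(0)
b = rng.standard_normal(n)
x = np.zeros(n)
residuals = [np.linalg.norm(b)]
for _ in range(20):
    x = two_grid_step(A, Ac, P, b, x)
    residuals.append(np.linalg.norm(b - A @ x))
```

Printing the ratios of consecutive entries of `residuals` for several values of `n` gives an empirical view of how the reduction factor behaves as the grid is refined.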

Concerning the V-cycle, its linear convergence rate has been proven only for sequences of matrices in some trigonometric algebras [2]. In spite of this, we show that the linear interpolation operator proves powerful enough to work also under some perturbations, and we then introduce two computationally favourable variants of the proposed multigrid, especially suited to the variable coefficients case. Indeed, in such a case, the setup phase of a multigrid based on the Galerkin approach is computationally too expensive because the structure of the coefficient matrix cannot be preserved, and this prevents a fast computation of the matrix-vector product. Therefore, we propose two preconditioning alternatives:

1) a Galerkin multigrid applied to a band preconditioner built using the Laplacian;

2) a geometric multigrid applied to the full coefficient matrix.

The first strategy is computationally very attractive because it involves matrices with only five nonzero diagonals and hence has a linear cost in $N$, with a small constant. However, its effectiveness decreases when both the fractional derivative orders are far from $2$. On the contrary, the second strategy is reliable for every fractional order, does not need the setup phase and is such that each iteration has a computational cost only about 4/3 times higher than the Jacobi smoother (see [33]). Moreover, it is also well suited as a stand-alone solver.

The paper is organized as follows. In Section 2 we introduce the two-dimensional FDEs and recall the CN-WSGD scheme. In Section 3 we first describe the notion of symbol and of spectral distribution of a matrix-sequence, as well as the idea behind the GLT theory. Later, we use these tools to retrieve spectral information on the involved matrices. Such spectral information is then used in Section 4 to study the convergence and the optimality of the proposed multigrid. Finally, Section 5 is devoted to numerical examples and Section 6 contains conclusions and open problems.

2 Problem setting: the CN-WSGD scheme for 2D space-FDEs

In this paper, we are interested in the following initial-boundary value problem of two-dimensional space-FDE

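$$\begin{cases} \frac{\partial u(x,y,t)}{\partial t} = d_{+}(x,y,t) \frac{\partial^{\alpha} u(x,y,t)}{\partial_{+} x^{\alpha}} + d_{-}(x,y,t) \frac{\partial^{\alpha} u(x,y,t)}{\partial_{-} x^{\alpha}} + e_{+}(x,y,t) \frac{\partial^{\beta} u(x,y,t)}{\partial_{+} y^{\beta}} + e_{-}(x,y,t) \frac{\partial^{\beta} u(x,y,t)}{\partial_{-} y^{\beta}} + v(x,y,t), & (x,y,t) \in \Omega \times (0,T], \\ u(x,y,t) = 0, & (x,y,t) \in \partial\Omega \times [0,T], \\ u(x,y,0) = u_{0}(x,y), & (x,y) \in \bar{\Omega}, \end{cases} \tag{2.1}$$

where $\Omega=(a_{1},b_{1})\times(a_{2},b_{2})$ is the space domain, and $\alpha,\beta\in(1,2)$ are the fractional derivative orders with respect to $x$ and $y$, respectively. The nonnegative functions $d_{\pm}(x,y,t)$ and $e_{\pm}(x,y,t)$ are the diffusion coefficients and $v(x,y,t)$ is the forcing term. The left-sided $(+)$ and the right-sided $(-)$ fractional derivatives in (2.1) are defined in Riemann-Liouville form as follows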

[TABLE]

For the discretization of the FDE problem (2.1) we apply the second order accurate CN-WSGD scheme (see [31]). In order to introduce such a scheme, let us fix three positive integers $M, n_{1}, n_{2}$ and discretize the domain $\Omega\times[0,T]$ with

[TABLE]

Let us define

[TABLE]

Using the WSGD formula with shifting parameters $(p,q)=(1,0)$ , we obtain the following expression for the fractional derivatives

[TABLE]

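where $u(a_{1},y,t)=u(x,a_{2},t)=u(b_{1},y,t)=u(x,b_{2},t)=0$ because of the Dirichlet boundary conditions. Here the coefficients $w_{k}^{(\gamma)}$ are given by

$$w_{0}^{(\gamma)} = \frac{\gamma}{2} g_{0}^{(\gamma)}, \qquad w_{k}^{(\gamma)} = \frac{\gamma}{2} g_{k}^{(\gamma)} + \frac{2-\gamma}{2} g_{k-1}^{(\gamma)}, \quad k \geq 1, \tag{2.2}$$

where $g_{k}^{(\gamma)}$ are the alternating fractional binomial coefficients defined as

$$g_{k}^{(\gamma)} = (-1)^{k} \binom{\gamma}{k} = \frac{(-1)^{k}}{k!} \gamma(\gamma-1)\cdots(\gamma-k+1), \quad k = 0, 1, \dots, \tag{2.3}$$

with the formal notation $\binom{\gamma}{0}=1$. For $\gamma=\alpha,\beta$, the fractional binomial coefficients $g_{k}^{(\gamma)}$ have the following properties

$$\begin{cases} g_{0}^{(\gamma)} = 1, \quad g_{1}^{(\gamma)} = -\gamma, \quad g_{2}^{(\gamma)} > g_{3}^{(\gamma)} > \dots > 0, \\ \sum_{j=0}^{\infty} g_{j}^{(\gamma)} = 0, \quad \sum_{j=0}^{n} g_{j}^{(\gamma)} < 0, \ n \geq 1, \\ g_{j}^{(\gamma)} = O(j^{-(\gamma+1)}), \end{cases}$$

proved in [18, 19, 21].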

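Similarly, the coefficients $w_{k}^{(\gamma)}$ satisfy a few properties, summarized in the following proposition (see [31]).

Proposition 2.1.

Let $w_{k}^{(\gamma)}$, $1<\gamma<2$, be defined as in (2.2). Then the coefficients $w_{k}^{(\gamma)}$ satisfy the following properties

$$\begin{cases} w_{0}^{(\gamma)} = \frac{\gamma}{2}, \quad w_{1}^{(\gamma)} = \frac{2-\gamma-\gamma^{2}}{2} < 0, \quad w_{2}^{(\gamma)} = \frac{\gamma(\gamma^{2}+\gamma-4)}{4}, \\ 1 \geq w_{0}^{(\gamma)} \geq w_{3}^{(\gamma)} \geq w_{4}^{(\gamma)} \geq \dots \geq 0, \\ \sum_{j=0}^{\infty} w_{j}^{(\gamma)} = 0, \quad \sum_{j=0}^{n} w_{j}^{(\gamma)} < 0, \ n \geq 2, \\ w_{j}^{(\gamma)} = O(j^{-(\gamma+1)}). \end{cases}$$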
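The coefficients in (2.2) and (2.3) are cheap to generate with a two-term recurrence, and the properties listed in Proposition 2.1 can be spot-checked numerically. The sketch below is illustrative only: the function names are hypothetical, and the recurrence $g_{k}=g_{k-1}(k-1-\gamma)/k$ is a standard rewriting of (2.3) rather than a formula quoted from the paper.

```python
import numpy as np

def grunwald_coefficients(gamma, n):
    """Return g_0, ..., g_n with g_k = (-1)^k * binom(gamma, k)."""
    g = np.empty(n + 1)
    g[0] = 1.0
    for k in range(1, n + 1):
        # Ratio of consecutive coefficients: g_k / g_{k-1} = (k - 1 - gamma) / k.
        g[k] = g[k - 1] * (k - 1 - gamma) / k
    return g

def wsgd_coefficients(gamma, n):
    """Return w_0, ..., w_n for the WSGD formula with shifts (p, q) = (1, 0)."""
    g = grunwald_coefficients(gamma, n)
    w = np.empty(n + 1)
    w[0] = 0.5 * gamma * g[0]
    w[1:] = 0.5 * gamma * g[1:] + 0.5 * (2.0 - gamma) * g[:-1]
    return w

def check_proposition(gamma, n=200):
    """Spot-check the closed forms, signs, monotonicity and partial sums."""
    w = wsgd_coefficients(gamma, n)
    return {
        "w1_closed_form": bool(np.isclose(w[1], (2 - gamma - gamma**2) / 2)),
        "w2_closed_form": bool(np.isclose(w[2], gamma * (gamma**2 + gamma - 4) / 4)),
        "tail_nonnegative": bool(np.all(w[3:] >= 0)),
        "tail_nonincreasing": bool(np.all(np.diff(w[3:]) <= 0)),
        "w0_dominates_w3": bool(1 >= w[0] >= w[3]),
        "partial_sums_negative": bool(np.all(np.cumsum(w)[2:] < 0)),
    }

checks_15 = check_proposition(1.5)
checks_18 = check_proposition(1.8)
```

Note that $w_{2}^{(\gamma)}$ changes sign inside $(1,2)$: for example it is negative for $\gamma=1.5$ and positive for $\gamma=1.8$, which is why it is excluded from the monotone chain in the proposition.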

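The CN-WSGD scheme is obtained by combining the CN scheme in time with the WSGD formula for the fractional derivatives and can be written as follows

$$(1 - \delta_{x}^{\alpha,(m)} - \delta_{y}^{\beta,(m)}) u_{i,j}^{(m)} = (1 + \delta_{x}^{\alpha,(m-1)} + \delta_{y}^{\beta,(m-1)}) u_{i,j}^{(m-1)} + \triangle t \, v_{i,j}^{(m-1/2)}, \tag{2.4}$$

where $u_{i,j}^{(m)}\approx u(x_{i},y_{j},t^{(m)})$, $v_{i,j}^{(m-1/2)}=v(x_{i},y_{j},t^{(m-1/2)})$, and

$$\delta_{x}^{\alpha,(m)} = \frac{d_{i,j}^{+,(m)} \triangle t}{2} \, {}_{L}D_{h_{x}}^{\alpha} + \frac{d_{i,j}^{-,(m)} \triangle t}{2} \, {}_{R}D_{h_{x}}^{\alpha}, \qquad \delta_{y}^{\beta,(m)} = \frac{e_{i,j}^{+,(m)} \triangle t}{2} \, {}_{L}D_{h_{y}}^{\beta} + \frac{e_{i,j}^{-,(m)} \triangle t}{2} \, {}_{R}D_{h_{y}}^{\beta},$$

with

$$d_{i,j}^{\pm,(m)} = d_{\pm}(x_{i}, y_{j}, t^{(m)}), \qquad e_{i,j}^{\pm,(m)} = e_{\pm}(x_{i}, y_{j}, t^{(m)}).$$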

Now, we can write the matrix form of the above discretization. First, we need to introduce the following notations. Let $N=n_{1}n_{2}$ and define the following objects:

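1. The $N$-dimensional vectors $\mathbf{u}^{(m)}$, $\mathbf{d}_{\pm}^{(m)}$, $\mathbf{e}_{\pm}^{(m)}$, $\mathbf{v}^{(m-1/2)}$

[TABLE]

2. The $N\times N$ diagonal matrices

$$D_{\pm}^{(m)} = \operatorname{diag}(\mathbf{d}_{\pm}^{(m)}), \qquad E_{\pm}^{(m)} = \operatorname{diag}(\mathbf{e}_{\pm}^{(m)}).$$

3. The $\tilde{N}\times\tilde{N}$ Toeplitz matrix

$$A^{\gamma}_{\tilde{N}}=-\left(\begin{array}{cccccc}w_{1}^{(\gamma)}&w_{0}^{(\gamma)}&0&\cdots&0&0\\ w_{2}^{(\gamma)}&w_{1}^{(\gamma)}&w_{0}^{(\gamma)}&\ddots&\ddots&0\\ \vdots&w_{2}^{(\gamma)}&w_{1}^{(\gamma)}&\ddots&\ddots&\vdots\\ \vdots&\ddots&\ddots&\ddots&\ddots&\vdots\\ w_{\tilde{N}-1}^{(\gamma)}&\ddots&\ddots&\ddots&w_{1}^{(\gamma)}&w_{0}^{(\gamma)}\\ w_{\tilde{N}}^{(\gamma)}&w_{\tilde{N}-1}^{(\gamma)}&\cdots&\cdots&w_{2}^{(\gamma)}&w_{1}^{(\gamma)}\\ \end{array}\right). \tag{2.5}$$

4. The $N\times N$ matrices

$$A_{x}^{(m)} = D_{+}^{(m)} (I_{n_{2}} \otimes A_{n_{1}}^{\alpha}) + D_{-}^{(m)} (I_{n_{2}} \otimes (A_{n_{1}}^{\alpha})^{T}), \qquad A_{y}^{(m)} = E_{+}^{(m)} (A_{n_{2}}^{\beta} \otimes I_{n_{1}}) + E_{-}^{(m)} ((A_{n_{2}}^{\beta})^{T} \otimes I_{n_{1}}), \tag{2.6}$$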

where $\otimes$ denotes the usual Kronecker product.

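Thus the CN-WSGD method (2.4) can be written in the following matrix form

$$(I_{N} + r A_{x}^{(m)} + s A_{y}^{(m)}) \mathbf{u}^{(m)} = (I_{N} - r A_{x}^{(m-1)} - s A_{y}^{(m-1)}) \mathbf{u}^{(m-1)} + \triangle t \, \mathbf{v}^{(m-1/2)},$$

where $r=\frac{\triangle t}{2h_{x}^{\alpha}}$, $s=\frac{\triangle t}{2h_{y}^{\beta}}$, and $I_{N}$ denotes the $N\times N$ identity matrix. Multiplying both sides by $\frac{1}{r}$, we obtain

$$\bigg(\frac{1}{r}I_{N}+A^{(m)}_{x}+\frac{s}{r}A^{(m)}_{y}\bigg)\mathbf{u}^{(m)}=\bigg(\frac{1}{r}I_{N}-A^{(m-1)}_{x}-\frac{s}{r}A^{(m-1)}_{y}\bigg)\mathbf{u}^{(m-1)}+2h_{x}^{\alpha}\mathbf{v}^{(m-1/2)}.$$

By defining

$$\mathbf{\mathcal{M}}^{(m)}_{(\alpha,\beta),N} := \frac{1}{r}I_{N}+A^{(m)}_{x}+\frac{s}{r}A^{(m)}_{y}, \qquad \mathbf{b}^{(m)} := \bigg(\frac{1}{r}I_{N}-A^{(m-1)}_{x}-\frac{s}{r}A^{(m-1)}_{y}\bigg)\mathbf{u}^{(m-1)}+2h_{x}^{\alpha}\mathbf{v}^{(m-1/2)}, \tag{2.8}$$

the linear system (2.7), which has to be solved at each time step $t^{(m)}$, can be written as

$$\mathbf{\mathcal{M}}^{(m)}_{(\alpha,\beta),N} \mathbf{u}^{(m)} = \mathbf{b}^{(m)}.$$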

Let $\gamma_{0}=(-1+\sqrt{17})/2\approx 1.562$, the positive root of $\gamma^{2}+\gamma-4=0$. The following results are proved in [31]. If $\gamma\in[\gamma_{0},2)$, from Proposition 2.1 it follows that $w_{2}^{(\gamma)}\geq 0$, and hence the coefficient matrix $\mathbf{\mathcal{M}}^{(m)}_{(\alpha,\beta),N}$ is a strictly diagonally dominant M-matrix. In addition, when the diffusion coefficients are time independent, the CN-WSGD scheme is unconditionally stable and the truncation error is $\mathcal{O}(h^{2}_{x}+h^{2}_{y}+\triangle t^{2})$. On the other hand, when $d_{+}(x,y,t)=d_{-}(x,y,t)=d$ and $e_{+}(x,y,t)=e_{-}(x,y,t)=e$, with $d$ and $e$ nonnegative constants, the matrix $\mathbf{\mathcal{M}}^{(m)}_{(\alpha,\beta),N}$ is a symmetric positive definite Block-Toeplitz-Toeplitz-Blocks (BTTB) matrix, cf. Definition 3.1.
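The structural claims above can be checked on a small grid. The sketch below assembles the coefficient matrix for constant diffusion coefficients following (2.5), (2.6), and the scaled system obtained after multiplying by $1/r$, and then tests diagonal dominance, the sign pattern, and positive definiteness. The grid ($n_{1}=n_{2}=8$ on the unit square with $h_{x}=h_{y}=\triangle t=1/9$), the orders $\alpha=1.6$, $\beta=1.8$ (both above $\gamma_{0}$), and all function names are illustrative assumptions; dense Kronecker products are used only because the example is tiny.

```python
import numpy as np

def wsgd_coefficients(gamma, n):
    """WSGD weights w_0, ..., w_n for shifts (p, q) = (1, 0)."""
    g = np.ones(n + 1)
    for k in range(1, n + 1):
        g[k] = g[k - 1] * (k - 1 - gamma) / k  # g_k = (-1)^k binom(gamma, k)
    w = 0.5 * gamma * g
    w[1:] += 0.5 * (2.0 - gamma) * g[:-1]
    return w

def wsgd_toeplitz(gamma, n):
    """Toeplitz matrix A^gamma = -[w_{i-j+1}] of order n, as in (2.5)."""
    w = wsgd_coefficients(gamma, n)
    A = np.zeros((n, n))
    for i in range(n):
        for j in range(min(i + 2, n)):  # entries with i - j + 1 >= 0
            A[i, j] = -w[i - j + 1]
    return A

def assemble_M(alpha, beta, n1, n2, hx, hy, dt, d_plus, d_minus, e_plus, e_minus):
    """Coefficient matrix (1/r) I + A_x + (s/r) A_y for constant coefficients."""
    r = dt / (2.0 * hx**alpha)
    s = dt / (2.0 * hy**beta)
    Aa, Ab = wsgd_toeplitz(alpha, n1), wsgd_toeplitz(beta, n2)
    I1, I2 = np.eye(n1), np.eye(n2)
    Ax = d_plus * np.kron(I2, Aa) + d_minus * np.kron(I2, Aa.T)
    Ay = e_plus * np.kron(Ab, I1) + e_minus * np.kron(Ab.T, I1)
    return np.eye(n1 * n2) / r + Ax + (s / r) * Ay

def is_strictly_diagonally_dominant(M):
    d = np.abs(np.diag(M))
    return bool(np.all(d > np.abs(M).sum(axis=1) - d))

n1 = n2 = 8
h = 1.0 / (n1 + 1)  # assumed uniform grid on the unit square
M = assemble_M(1.6, 1.8, n1, n2, h, h, h, 1.0, 1.0, 1.0, 1.0)
sdd = is_strictly_diagonally_dominant(M)
offdiag_nonpositive = bool(np.all(M - np.diag(np.diag(M)) <= 0.0))
symmetric = bool(np.allclose(M, M.T))
min_eig = np.linalg.eigvalsh(M).min()

# Unequal left/right coefficients: still diagonally dominant, no longer symmetric.
M_ns = assemble_M(1.6, 1.8, n1, n2, h, h, h, 2.0, 0.5, 1.0, 1.0)
sdd_ns = is_strictly_diagonally_dominant(M_ns)
symmetric_ns = bool(np.allclose(M_ns, M_ns.T))
```

In a practical solver the Kronecker products would not be formed; the matrix would be applied through FFT-based Toeplitz products.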

3 Spectral analysis of the coefficient matrix

Given the notion of symbol and of spectral distribution in the eigenvalue and singular value sense, in this section we provide a spectral analysis of the coefficient matrix-sequence $\{\mathbf{\mathcal{M}}^{(m)}_{(\alpha,\beta),N}\}_{N\in\mathbb{N}}$. In the constant coefficient case, as already observed in other papers (see e.g., [16]), the coefficient matrix-sequence is a BTTB sequence: then using well-known spectral tools for BTTB sequences we determine its symbol and study its spectral distribution. In the nonconstant coefficients case, under appropriate conditions, we show that $\{\mathbf{\mathcal{M}}^{(m)}_{(\alpha,\beta),N}\}_{N\in\mathbb{N}}$ belongs to the GLT class and use the GLT machinery to analyse its singular value/eigenvalue distribution. The resulting spectral information will then be used in Section 4 for the analysis and the design of numerical solvers to be applied to the considered linear systems.

3.1 Constant diffusion coefficients case

Let us assume that the diffusion coefficients are constant. Under this condition, $\{\mathbf{\mathcal{M}}^{(m)}_{(\alpha,\beta),N}\}_{{}_{N\in\mathbb{N}}}$ is a sequence of BTTB matrices, or equivalently of 2-level Toeplitz matrices according to the following definition.

Definition 3.1.

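Let $f\in L^{1}([-\pi,\pi]^{d})$ and let $\{f_{\mathbf{k}}\}_{\mathbf{k}\in\mathbb{Z}^{d}}$ be the sequence of its Fourier coefficients defined as

$$f_{\mathbf{k}} := \frac{1}{(2\pi)^{d}} \int_{[-\pi,\pi]^{d}} f(\bm{\theta}) \, {\rm e}^{-\mathbf{i}\langle \mathbf{k}, \bm{\theta} \rangle} \, d\bm{\theta},$$

where $\left\langle{\bf k,\bm{\theta}}\right\rangle=\sum_{t=1}^{d}k_{t}\theta_{t}$. Then the $d$-level Toeplitz matrix of partial orders $\mathbf{n}=(n_{1},n_{2},\ldots,n_{d})$ associated with $f$ is

$$T_{N}^{(d)}(f) := [f_{\mathbf{i}-\mathbf{j}}]_{\mathbf{i},\mathbf{j}=\mathbf{1}}^{\mathbf{n}} = \Big[\cdots\Big[\Big[f_{i_{1}-j_{1}, i_{2}-j_{2}, \dots, i_{d}-j_{d}}\Big]_{i_{d},j_{d}=1}^{n_{d}}\Big]_{i_{d-1},j_{d-1}=1}^{n_{d-1}}\cdots\Big]_{i_{1},j_{1}=1}^{n_{1}},$$

where $N=\prod_{i=1}^{d}n_{i}$ is the order of the matrix. The function $f$ is called the symbol of the matrix-sequence $\{T^{(d)}_{N}(f)\}_{N}$.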

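To clarify the notation for the case $d=2$ of interest, the BTTB matrix of order $N$ associated with $f$ is

$$T^{(2)}_{N}(f)=\bigg[\bigg[f_{[i_{1}-j_{1},i_{2}-j_{2}]}\bigg]_{i_{1},j_{1}=1}^{n_{1}}\bigg]_{i_{2},j_{2}=1}^{n_{2}},$$

or equivalently,

$$T_{N}^{(2)}(f) = \sum_{|j_{1}| \leq n_{1}} \sum_{|j_{2}| \leq n_{2}} f_{[j_{1}, j_{2}]} \, J_{n_{1}}^{[j_{1}]} \otimes J_{n_{2}}^{[j_{2}]},$$

where $J_{n_{i}}^{[j_{i}]}\in\mathbb{R}^{n_{i}\times n_{i}}$ are matrices whose $(s,t)$-th entry equals $1$ if $s-t=j_{i}$ and $0$ elsewhere.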
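The Kronecker-sum representation can be tried out directly. The sketch below builds a BTTB matrix from a small set of Fourier coefficients by summing $f_{[j_{1},j_{2}]}\,J_{n_{1}}^{[j_{1}]}\otimes J_{n_{2}}^{[j_{2}]}$, taking the factor order exactly as written in that formula. As a test symbol it uses $f(\theta_{1},\theta_{2})=4-2\cos\theta_{1}-2\cos\theta_{2}$, the well-known symbol of the five-point Laplacian, which is an example chosen here and not one taken from the paper; the function names are hypothetical.

```python
import numpy as np

def shift_matrix(n, j):
    """J_n^{[j]}: entry (s, t) is 1 if s - t == j and 0 elsewhere."""
    return np.eye(n, k=-j)

def bttb(coeffs, n1, n2):
    """Sum of f_[j1, j2] * kron(J_{n1}^{[j1]}, J_{n2}^{[j2]}) over given coefficients.

    `coeffs` maps index pairs (j1, j2) to Fourier coefficients; pairs that are
    absent are treated as zero.
    """
    T = np.zeros((n1 * n2, n1 * n2))
    for (j1, j2), value in coeffs.items():
        T += value * np.kron(shift_matrix(n1, j1), shift_matrix(n2, j2))
    return T

def tridiag(n):
    """Standard 1D second-difference matrix tridiag(-1, 2, -1)."""
    return 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

# Fourier coefficients of f(theta1, theta2) = 4 - 2 cos(theta1) - 2 cos(theta2).
laplacian_coeffs = {(0, 0): 4.0, (1, 0): -1.0, (-1, 0): -1.0,
                    (0, 1): -1.0, (0, -1): -1.0}
n1, n2 = 4, 3
T = bttb(laplacian_coeffs, n1, n2)
reference = np.kron(tridiag(n1), np.eye(n2)) + np.kron(np.eye(n1), tridiag(n2))

# Entry ((s1, s2), (t1, t2)) depends only on the differences (s1 - t1, s2 - t2).
structure_ok = all(
    T[s1 * n2 + s2, t1 * n2 + t2] == laplacian_coeffs.get((s1 - t1, s2 - t2), 0.0)
    for s1 in range(n1) for s2 in range(n2)
    for t1 in range(n1) for t2 in range(n2)
)
```

The last check makes the two-level Toeplitz structure explicit: each entry is determined by the pair of index differences alone.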

When $d=1$ , i.e., for Toeplitz matrices, we simplify the notation using

$$T_{N}(f) := T_{N}^{(1)}(f).$$

Definition 3.2.

The Wiener class is the set of functions

[TABLE]

We determine the sequence of symbols associated to $\{\mathbf{\mathcal{M}}^{(m)}_{(\alpha,\beta),N}\}_{{}_{N\in\mathbb{N}}}$ as a corollary of the following proposition.

Proposition 3.3.

Let $\gamma\in(1,2)$ and let $A^{\gamma}_{\tilde{N}}$ be defined as in (2.5). Then the symbol associated to the matrix-sequence $\{A^{\gamma}_{\tilde{N}}\}_{{}_{\tilde{N}\in\mathbb{N}}}$ belongs to the Wiener class and its formal expression is given by

[TABLE]

Proof.

Let us observe that $A^{\gamma}_{\tilde{N}}=[-w^{(\gamma)}_{i-j+1}]_{i,j=1}^{\tilde{N}}$ , with $w^{(\gamma)}_{k}=0$ for $k<0$ , and let us define the function

[TABLE]

Then, by Definition 3.1, it holds that $A^{\gamma}_{\tilde{N}}=T_{\tilde{N}}(f_{\gamma})$ . When $\gamma\in(1,2)$ , it is easy to see that $f_{\gamma}(\xi)$ lies in the Wiener class. In detail, from Proposition 2.1 we know that $w_{1}^{(\gamma)}=\frac{2-\gamma-\gamma^{2}}{2}<0$ , $w_{2}^{(\gamma)}=\frac{\gamma(\gamma^{2}+\gamma-4)}{4}$ , $w_{k}^{(\gamma)}>0$ for $k\geq 0$ and $k\neq 1,2$ , and $w_{k}^{(\gamma)}=0$ for $k<0$ . Then

[TABLE]

Again from Proposition 2.1, we deduce that

[TABLE]

that is

[TABLE]

The right-hand side of the previous relation is a positive constant for $\gamma\in(1,2)$, so we can conclude that $f_{\gamma}(\xi)$ belongs to the Wiener class. An explicit formula for the symbol $f_{\gamma}(\xi)$ can be obtained by combining the definition of $w^{(\gamma)}_{k}$ given in (2.2) with that of $g_{k}^{(\gamma)}$ in (2.3) as follows

[TABLE]

Applying the well-known binomial series

[TABLE]

with $z={\rm e}^{\mathbf{i}(\xi+\pi)}$ , the thesis follows. ∎
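The computation outlined in the proof can be written out step by step. The following is a sketch derived here from $A^{\gamma}_{\tilde{N}}=[-w^{(\gamma)}_{i-j+1}]$, the definitions (2.2) and (2.3), and the binomial series $\sum_{k\geq 0}\binom{\gamma}{k}z^{k}=(1+z)^{\gamma}$ evaluated at $z={\rm e}^{\mathbf{i}(\xi+\pi)}=-{\rm e}^{\mathbf{i}\xi}$. The closed form on the last line is the result of this derivation and should be checked against the statement of Proposition 3.3 in the published paper.

```latex
\begin{aligned}
f_{\gamma}(\xi)
&= -\sum_{k=-1}^{\infty} w^{(\gamma)}_{k+1}\,{\rm e}^{\mathbf{i}k\xi}
 = -{\rm e}^{-\mathbf{i}\xi}\sum_{j=0}^{\infty} w^{(\gamma)}_{j}\,{\rm e}^{\mathbf{i}j\xi} \\
&= -{\rm e}^{-\mathbf{i}\xi}\left[\frac{\gamma}{2}\sum_{j=0}^{\infty} g^{(\gamma)}_{j}\,{\rm e}^{\mathbf{i}j\xi}
   + \frac{2-\gamma}{2}\,{\rm e}^{\mathbf{i}\xi}\sum_{j=0}^{\infty} g^{(\gamma)}_{j}\,{\rm e}^{\mathbf{i}j\xi}\right] \\
&= -\left[\frac{\gamma}{2}\,{\rm e}^{-\mathbf{i}\xi} + \frac{2-\gamma}{2}\right]
   \sum_{j=0}^{\infty}\binom{\gamma}{j}\left(-{\rm e}^{\mathbf{i}\xi}\right)^{j} \\
&= -\frac{2-\gamma\left(1-{\rm e}^{-\mathbf{i}\xi}\right)}{2}
   \left(1+{\rm e}^{\mathbf{i}(\xi+\pi)}\right)^{\gamma}.
\end{aligned}
```

The second line uses $w^{(\gamma)}_{j}=\frac{\gamma}{2}g^{(\gamma)}_{j}+\frac{2-\gamma}{2}g^{(\gamma)}_{j-1}$ with the convention $g^{(\gamma)}_{-1}=0$, and the third uses $g^{(\gamma)}_{j}=(-1)^{j}\binom{\gamma}{j}$. As a quick consistency check, at $\xi=0$ the last factor vanishes, in agreement with $\sum_{j=0}^{\infty}w^{(\gamma)}_{j}=0$ from Proposition 2.1.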

Corollary 3.4.

Let $\alpha,\beta\in(1,2)$ and let us assume that $d_{+}(x,y,t)=d^{+}>0$, $d_{-}(x,y,t)=d^{-}>0$, $e_{+}(x,y,t)=e^{+}>0$, $e_{-}(x,y,t)=e^{-}>0$. Then the matrix $\mathbf{\mathcal{M}}^{(m)}_{(\alpha,\beta),N}$ defined as in (2.8) is the BTTB matrix $\mathbf{\mathcal{M}}^{(m)}_{(\alpha,\beta),N}=T_{N}^{(2)}(\varphi_{(\alpha,\beta)})$ with

[TABLE]

where $r=\frac{\triangle t}{2h_{x}^{\alpha}}$ and $s=\frac{\triangle t}{2h_{y}^{\beta}}$ .

Proof.

According to Definition 3.1 and thanks to Proposition 3.3, the BTTB terms of $A_{x}^{(m)}$ and $A_{y}^{(m)}$ in equation (2.6) can be written as

[TABLE]

where $\tilde{f}_{\alpha}(\theta_{1},\theta_{2})=f_{\alpha}(\theta_{1})$ and $\tilde{f}_{\beta}(\theta_{1},\theta_{2})=f_{\beta}(\theta_{2})$ .

Finally, recalling that for a real function $f$ it holds $T_{N}^{(d)}(f(\bm{\theta}))^{T}=T_{N}^{(d)}(f(-\bm{\theta}))$ and replacing equations (2.6) in (2.8), we obtain $\mathbf{\mathcal{M}}^{(m)}_{(\alpha,\beta),N}=T_{N}^{(2)}(\varphi_{(\alpha,\beta)})$ with $\varphi_{(\alpha,\beta)}$ defined as in (3.3). ∎

Now we focus our attention on the spectral distribution of $\{\mathbf{\mathcal{M}}^{(m)}_{(\alpha,\beta),N}\}_{{}_{N\in\mathbb{N}}}$ , under the further assumption that the diffusion coefficients are equal on both sides. By this hypothesis, $\{\mathbf{\mathcal{M}}^{(m)}_{(\alpha,\beta),N}\}_{{}_{N\in\mathbb{N}}}$ is a sequence of symmetric BTTB matrices. Let us start with the definition of the spectral distribution in the sense of the eigenvalues and of the singular values.

Definition 3.5.

Let $f:G\to\mathbb{C}$ be a measurable function, defined on a measurable set $G\subset\mathbb{R}^{k}$ with $k\geq 1$ , $0<m_{k}(G)<\infty$ . Let $\mathcal{C}_{0}(\mathbb{K})$ be the set of continuous functions with compact support over $\mathbb{K}\in\{\mathbb{C},\mathbb{R}_{0}^{+}\}$ and let $\{A_{N}\}$ be a sequence of matrices of size $N$ with eigenvalues $\lambda_{j}(A_{N})$ , $j=1,\ldots,N$ and singular values $\sigma_{j}(A_{N})$ , $j=1,\ldots,N$ .

•

$\{A_{N}\}$ is distributed as the pair $(f,G)$ in the sense of the eigenvalues, in symbols $\{A_{N}\}\sim_{\lambda}(f,G)$ , if the following limit relation holds for all $F\in\mathcal{C}_{0}(\mathbb{C})$

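$$\lim_{N\to\infty}\frac{1}{N}\sum_{j=1}^{N}F(\lambda_{j}(A_{N}))=\frac{1}{m_{k}(G)}\int_{G}F(f(t))\,\mathrm{d}t.$$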

•

$\{A_{N}\}$ is distributed as the pair $(f,G)$ in the sense of the singular values, in symbols $\{A_{N}\}\sim_{\sigma}(f,G)$ , if the following limit relation holds for all $F\in\mathcal{C}_{0}(\mathbb{R}_{0}^{+})$

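$$\lim_{N\to\infty}\frac{1}{N}\sum_{j=1}^{N}F(\sigma_{j}(A_{N}))=\frac{1}{m_{k}(G)}\int_{G}F(|f(t)|)\,\mathrm{d}t.$$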

For Hermitian $d$ -level Toeplitz matrix-sequences, the following theorem due to Szegö and Tilli holds (see [11, 32]).

Theorem 3.6.

Let $f\in L^{1}([-\pi,\pi]^{d})$ be a real-valued function, then

$$\{T^{(d)}_{N}(f)\}_{{}_{N\in\mathbb{N}}}\sim_{\lambda}(f,[-\pi,\pi]^{d}).$$
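The eigenvalue distribution stated in Theorem 3.6 can be illustrated on a toy example. The Python sketch below (with NumPy) is not taken from the paper: it uses the one-level symbol $f(\theta)=2-2\cos\theta$, whose Toeplitz matrix is ${\rm tridiag}(-1,2,-1)$, and the test function $F(x)={\rm e}^{-x}$, both chosen only for illustration. Since the spectrum is bounded, the compact support of $F$ is immaterial here.

```python
import numpy as np


def toeplitz_1d_laplacian(n):
    """T_n(f) for the toy symbol f(theta) = 2 - 2 cos(theta)."""
    return 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)


def F(x):
    """Continuous test function (illustrative choice)."""
    return np.exp(-x)


n = 400
eigs = np.linalg.eigvalsh(toeplitz_1d_laplacian(n))
lhs = F(eigs).mean()  # (1/N) * sum_j F(lambda_j(A_N))

# midpoint rule for (1/(2 pi)) * integral of F(f(theta)) over [-pi, pi]
m = 20000
theta = -np.pi + 2.0 * np.pi * (np.arange(m) + 0.5) / m
rhs = F(2.0 - 2.0 * np.cos(theta)).mean()

print(lhs, rhs)
```

The two printed numbers should approach each other as $n$ grows, which is the content of the relation $\sim_{\lambda}$ for this particular $F$.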

Now, we recall a property of the spectral norm of $d$ -level Toeplitz matrices and we state a relevant theorem contained in [12]. Given a square matrix $A$ of order $N$ , we denote its spectral norm by $\|A\|=\sigma_{1}(A)$ and its trace norm by $\|A\|_{1}=\sum\limits_{i=1}^{N}\sigma_{i}(A)$ . For the spectral norm of a $d$ -level Toeplitz sequence $\{T^{(d)}_{N}(f)\}_{{}_{N\in\mathbb{N}}}$ generated by $f$ it holds (see Corollary 3.5 in [26])

$$f\in L^{\infty}([-\pi,\pi]^{d})\ \Rightarrow\ \|T^{(d)}_{N}(f)\|\leq\|f\|_{\infty}.\qquad(3.4)$$

Theorem 3.7.

(Theorem 3.4 in [12]) Let $\{A_{N}\}_{{}_{N\in\mathbb{N}}}$ be a matrix-sequence with $A_{N}=B_{N}+C_{N}$ and $B_{N}$ Hermitian $\forall N\in\mathbb{N}$ . Assume that

•

$\{B_{N}\}_{{}_{N\in\mathbb{N}}}\sim_{\lambda}(f,G)$ ,

•

$\|B_{N}\|,~{}\|C_{N}\|$ are bounded by a constant independent of $N$ ,

•

$\|C_{N}\|_{1}=o(N)$ .

Then $\{A_{N}\}_{{}_{N\in\mathbb{N}}}\sim_{\lambda}(f,G)$ .

The following proposition concerns the eigenvalue distribution of the coefficient matrix-sequence $\{\mathbf{\mathcal{M}}^{(m)}_{(\alpha,\beta),N}\}_{{}_{N\in\mathbb{N}}}$ when the diffusion coefficients are constant and equal.

Proposition 3.8.

Let us assume that $d_{\pm}(x,y,t)=d>0,~{}e_{\pm}(x,y,t)=e>0$ , that $\frac{1}{r}=o(1)$ , and $\frac{s}{r}=\frac{h_{x}^{\alpha}}{h_{y}^{\beta}}=\mathcal{O}(1)$ . Let $f_{\gamma}$ be defined as in (3.1) and define

[TABLE]

Given the matrix-sequence $\{\mathbf{\mathcal{M}}^{(m)}_{(\alpha,\beta),N}\}_{{}_{N\in\mathbb{N}}}$ , we have

[TABLE]

where

[TABLE]

is a real-valued continuous function and it is nonnegative for $\alpha,\beta\in(1,2)$ .

Proof.

Since the diffusion coefficients $d_{\pm}(x,y,t)$ and $e_{\pm}(x,y,t)$ are constant and equal to real positive numbers $d$ and $e$ , respectively, the matrices of the sequence $\{A^{(m)}_{x}+\frac{s}{r}A^{(m)}_{y}\}_{{}_{N\in\mathbb{N}}}$ (see relation (2.6)) are symmetric. The function

[TABLE]

belongs to the Wiener algebra since $f_{\gamma}(\xi)$ itself is in the same algebra (see Proposition 3.3). Furthermore, from its expression, it also follows that $q_{\gamma}(\xi)$ is real-valued and globally continuous. Hence, $\mathbf{\mathcal{F}}_{(\alpha,\beta)}(\theta_{1},\theta_{2})$ is real-valued and globally continuous. Similarly, the nonnegativity of $\mathbf{\mathcal{F}}_{(\alpha,\beta)}(\theta_{1},\theta_{2})$ follows from the nonnegativity of $q_{\gamma}(\xi)$ which in turn can be easily derived from the expression of $f_{\gamma}(\xi)$ in (3.1) for $\gamma\in(1,2)$ and $\xi\in\{\theta_{1},\theta_{2}\}$ .

From Theorem 3.6 with $d=2$ and since $\frac{s}{r}=\frac{h_{x}^{\alpha}}{h_{y}^{\beta}}=\mathcal{O}(1)$ , it follows that $\{A^{(m)}_{x}+\frac{s}{r}A^{(m)}_{y}\}_{{}_{N\in\mathbb{N}}}\sim_{\lambda}(\mathbf{\mathcal{F}}_{(\alpha,\beta)}(\theta_{1},\theta_{2}),[-\pi,\pi]^{2})$ . Furthermore, using (3.4) with $d=2$ , we have that

[TABLE]

with $C$ independent of $N$ . Moreover, under the hypothesis that $\frac{1}{r}=o(1)$ , the remaining term $\frac{1}{r}I_{N}$ is such that $\|\frac{1}{r}I_{N}\|_{1}=o(N)$ and $\|\frac{1}{r}I_{N}\|=\frac{1}{r}<\tilde{C}$ for some constant $\tilde{C}$ independent of $N$ . By Theorem 3.7, we conclude that the distribution of $\{\mathbf{\mathcal{M}}^{(m)}_{(\alpha,\beta),N}\}_{{}_{N\in\mathbb{N}}}$ is determined only by $\mathbf{\mathcal{F}}_{(\alpha,\beta)}(\theta_{1},\theta_{2})$ . ∎

Let us recall that both $q_{\alpha}(\theta_{1})$ and $q_{\beta}(\theta_{2})$ are nonnegative functions. Moreover, as a straightforward consequence of Proposition 4 in [8], $q_{\alpha}(\theta_{1})$ and $q_{\beta}(\theta_{2})$ have a zero at $0$ of order $\alpha$ and $\beta$ , respectively. With this result in mind, in the following proposition we prove that the limit superior of $\frac{\mathbf{\mathcal{F}}_{(\alpha,\beta)}(\theta_{1},\theta_{2})}{|\bm{\theta}|^{\gamma}}$ , with $\gamma=\min\{\alpha,\beta\}$ , is bounded as $(\theta_{1},\theta_{2})\to(0,0)$ . This proposition will be used in Section 4 to prove the constant convergence rate of the two-grid method and of the V-cycle.

Proposition 3.9.

Let $\alpha,\beta\in(1,2)$ , with $\alpha\neq\beta$ , and let $\gamma=\min\{\alpha,\beta\}$ , then there exist two real constants $C_{1},C_{2}>0$ such that

[TABLE]

where $\mathbf{0}=(0,0)$ and $\bm{\theta}=(\theta_{1},\theta_{2})$ .

Proof.

Let us rewrite $1+{\rm e}^{{\bf i}(\theta_{k}+\pi)}$ and $1-{\rm e}^{-{\bf i}\theta_{k}}$ in polar form

[TABLE]

where

[TABLE]

for $k=1,2$ . According to the polar form we have that

[TABLE]

Moreover, we have that

[TABLE]

and

[TABLE]

Note that if $\alpha-\gamma\geq 0$ and $\beta-\gamma\geq 0$ , i.e., $\gamma=\min\{\alpha,\beta\}$ , then both limits (3.10) and (3.11) are finite. Moreover, since

[TABLE]

from relation (3.9) it holds that

1. if $\gamma=\alpha$ , then

[TABLE]

2. if $\gamma=\beta$ , then

[TABLE]

It is easy to convince the reader that $\lim\limits_{\bm{\theta}\to\bm{0}}\frac{\mathbf{\mathcal{F}}_{(\alpha,\beta)}(\theta_{1},\theta_{2})}{|\bm{\theta}|^{\gamma}}$ does not exist. Indeed, if w.l.o.g. we fix $\gamma=\alpha<\beta$ , then along the lines $\theta_{1}=0$ and $\theta_{2}=0$ we get

[TABLE]

where $C$ is a positive constant. The equalities $(\triangle)$ are due to the fact that $q_{\alpha}(\theta_{1})$ and $q_{\beta}(\theta_{2})$ have a zero at $0$ of order $\alpha$ and $\beta$ , respectively, with $\alpha<\beta$ by hypothesis. Therefore, we obtain

[TABLE]

and this, observing that $\liminf\limits_{\bm{\theta}\to\bm{0}}\frac{\mathbf{\mathcal{F}}_{(\alpha,\beta)}(\theta_{1},\theta_{2})}{|\bm{\theta}|^{\gamma}}$ is nonnegative, completes the proof. ∎

3.2 Nonconstant diffusion coefficients case

Now we focus on the symbol associated with $\{\mathbf{\mathcal{M}}^{(m)}_{(\alpha,\beta),N}\}_{{}_{N\in\mathbb{N}}}$ and on its spectral distribution, when $d_{+}(x,y,t)$ , $d_{-}(x,y,t)$ , $e_{+}(x,y,t)$ and $e_{-}(x,y,t)$ are nonconstant functions. For this purpose we need the notion of GLT class and the related theory, starting from the pioneering work by Tilli [30] and widely generalized in [28]. In short, the GLT class is an algebra virtually containing any sequence of matrices coming from “reasonable” approximations by local discretization methods (finite differences, finite elements, isogeometric analysis, etc.) of partial differential equations (see [10]); in particular, it contains the multilevel Toeplitz sequences with Lebesgue integrable generating functions. The formal definition is rather technical and involves heavy notation: therefore we just give and briefly discuss the notion in two dimensions, which is the case of interest in our setting, and we report a few properties of the GLT class [10], which are sufficient for studying the spectral features of the matrices $\left\{{\cal M}^{(m)}_{(\alpha,\beta),N}\right\}_{N\in\mathbb{N}}$ .

Since a GLT sequence is a sequence of matrices obtained from a combination of some algebraic operations on multilevel Toeplitz matrices and diagonal sampling matrices, we need the following definition.

Definition 3.10.

Given a Riemann-integrable function $a:[0,1]^{2}\rightarrow\mathbb{C}$ , by diagonal sampling matrix of order $N=n_{1}n_{2}$ we mean $D_{N}(a)={\rm diag}_{{}_{j_{1}=1,\dots,n_{1}\atop j_{2}=1,\dots,n_{2}}}a\bigg{(}\frac{j_{1}}{n_{1}},\frac{j_{2}}{n_{2}}\bigg{)}$ .
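As a small illustration of Definition 3.10, the following Python sketch lists the diagonal of $D_{N}(a)$ for a given function $a$. The lexicographic ordering (index $j_{1}$ outer, $j_{2}$ inner) and the function name `diagonal_sampling` are assumptions made for this example; the definition itself only fixes the sampling points.

```python
def diagonal_sampling(a, n1, n2):
    """Diagonal entries a(j1/n1, j2/n2) of D_N(a), N = n1 * n2.

    Assumed ordering: j1 = 1..n1 is the outer index, j2 = 1..n2 the inner one.
    """
    return [a(j1 / n1, j2 / n2)
            for j1 in range(1, n1 + 1)
            for j2 in range(1, n2 + 1)]


# toy coefficient a(x, y) = x + y sampled on a 2 x 2 grid
diag = diagonal_sampling(lambda x, y: x + y, 2, 2)
print(diag)  # [1.0, 1.5, 1.5, 2.0]
```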

Throughout, we use the following notation

$$\{A_{N}\}_{N\in\mathbb{N}}\sim_{GLT}\kappa(\mathbf{x},\bm{\theta})$$

to say that the sequence $\{A_{N}\}_{N\in\mathbb{N}}$ is a GLT sequence with symbol $\kappa(\mathbf{x},\bm{\theta})$ .

Here we report four main features of the GLT class in two dimensions.

GLT1

Let $\{A_{N}\}_{N\in\mathbb{N}}\sim_{GLT}\kappa(\mathbf{x},\bm{\theta})$ with $\kappa:G\rightarrow\mathbb{C}$ , $G=[0,1]^{2}\times[-\pi,\pi]^{2}$ , then $\{A_{N}\}_{N\in\mathbb{N}}\sim_{\sigma}(\kappa,G)$ . If the matrices $A_{N}$ are Hermitian, then it holds also $\{A_{N}\}_{N\in\mathbb{N}}\sim_{\lambda}(\kappa,G)$ .

GLT2

The set of GLT sequences forms a $\ast$ -algebra, i.e., it is closed under linear combinations, products, inversion (whenever the symbol vanishes, at most, in a set of zero Lebesgue measure), and conjugation: hence, the sequence obtained via algebraic operations on a finite set of input GLT sequences is still a GLT sequence and its symbol is obtained by following the same algebraic manipulations on the corresponding symbols of the input GLT sequences.

GLT3

Every BTTB sequence $\{T^{(2)}_{N}(f)\}_{N\in\mathbb{N}}$ generated by an $L^{1}([-\pi,\pi]^{2})$ function $f(\bm{\theta})$ is such that $\{T^{(2)}_{N}(f)\}_{N\in\mathbb{N}}\sim_{GLT}f(\bm{\theta})$ , with the specifications reported in item $\mathbf{[GLT1]}$ . Every diagonal sampling sequence $\{D_{N}(a)\}_{N\in\mathbb{N}}$ , where $a(\mathbf{x})$ is a Riemann integrable function, is such that $\{D_{N}(a)\}_{N\in\mathbb{N}}\sim_{GLT}a(\mathbf{x})$ .

GLT4

Let $\{A_{N}\}_{N\in\mathbb{N}}\sim_{\sigma}(0,G)$ , $G=[0,1]^{2}\times[-\pi,\pi]^{2}$ , then $\{A_{N}\}_{N\in\mathbb{N}}\sim_{GLT}0$ .

Proposition 3.11.

Let us assume that $\frac{1}{r}=o(1)$ and $\frac{s}{r}=\frac{h_{x}^{\alpha}}{h_{y}^{\beta}}=\mathcal{O}(1)$ and that, for a fixed time instant $t_{m}$ , the functions $d_{+}(x,y):=d_{+}(x,y,t_{m})$ , $d_{-}(x,y):=d_{-}(x,y,t_{m})$ , $e_{+}(x,y):=e_{+}(x,y,t_{m})$ and $e_{-}(x,y):=e_{-}(x,y,t_{m})$ are Riemann integrable over $[a_{1},b_{1}]\times[a_{2},b_{2}]$ . For the matrix $\mathbf{\mathcal{M}}^{(m)}_{(\alpha,\beta),N}$ it holds

[TABLE]

with

[TABLE]

where

[TABLE]

and $(\hat{\mathbf{x}},\bm{\theta})\in[0,1]^{2}\times[-\pi,\pi]^{2},~{}(\mathbf{x},\bm{\theta})\in[a_{1},b_{1}]\times[a_{2},b_{2}]\times[-\pi,\pi]^{2}$ . Furthermore,

[TABLE]

and whenever $d_{+}(x,y)=d_{-}(x,y)=e_{+}(x,y)=e_{-}(x,y)$ , we also have

[TABLE]

with $h_{(\alpha,\beta)}(\mathbf{x},\bm{\theta})$ real-valued and indeed all the matrices $\mathbf{\mathcal{M}}^{(m)}_{(\alpha,\beta),N}$ have only real eigenvalues.

Proof.

Let us observe that, for a fixed time instant $t_{m}$ , the diagonal elements of the matrices $D^{(m)}_{\pm}$ and $E^{(m)}_{\pm}$ are a uniform sampling of the functions $d_{\pm}(\mathbf{x})$ and $e_{\pm}(\mathbf{x})$ , respectively, with $\mathbf{x}=(x,y)\in[a_{1},b_{1}]\times[a_{2},b_{2}]$ . Therefore, thanks to [GLT3] and to the Riemann integrability of the diffusion coefficients, we obtain

[TABLE]

Since the GLT class is stable under linear combinations and products [GLT2] and since BTTB sequences with $L^{1}([-\pi,\pi]^{2})$ symbols lie in the GLT class [GLT3], it is immediate to see that, under the hypothesis that $\frac{s}{r}=\frac{h_{x}^{\alpha}}{h_{y}^{\beta}}=\mathcal{O}(1)$ , the matrix-sequence $\{A^{(m)}_{x}+\frac{s}{r}A^{(m)}_{y}\}_{{}_{N\in\mathbb{N}}}$ is still a member of the GLT class and

[TABLE]

where $g_{\alpha}$ , $g_{\beta}$ are defined as in (3.14) and $(\hat{\mathbf{x}},\bm{\theta})\in[0,1]^{2}\times[-\pi,\pi]^{2}$ . Moreover, the sequence $\{\frac{1}{r}I_{N}\}_{{}_{N\in\mathbb{N}}}$ (under the hypothesis that $\frac{1}{r}=o(1)$ ) is a GLT sequence with zero symbol, item [GLT4]. This together with [GLT2] and (3.15) implies that $\{\mathbf{\mathcal{M}}^{(m)}_{(\alpha,\beta),N}\}_{{}_{N\in\mathbb{N}}}\sim_{GLT}\hat{h}_{(\alpha,\beta)}(\hat{\mathbf{x}},\bm{\theta})$ . Then by [GLT1] we can conclude $\{\mathbf{\mathcal{M}}^{(m)}_{(\alpha,\beta),N}\}_{{}_{N\in\mathbb{N}}}\sim_{\sigma}(\hat{h}_{(\alpha,\beta)}(\hat{\mathbf{x}},\bm{\theta}),[0,1]^{2}\times[-\pi,\pi]^{2})$ and hence $\{\mathbf{\mathcal{M}}^{(m)}_{(\alpha,\beta),N}\}_{{}_{N\in\mathbb{N}}}\sim_{\sigma}(h_{(\alpha,\beta)}(\mathbf{x},\bm{\theta}),[a_{1},b_{1}]\times[a_{2},b_{2}]\times[-\pi,\pi]^{2})$ , after an affine change of variable.

Now, by exploiting Proposition 3.3 and Proposition 3.8, since $\mathbf{\mathcal{F}}_{(\alpha,\beta)}(\bm{\theta})$ is real-valued, it is clear that if $d_{+}(\mathbf{x})=d_{-}(\mathbf{x})=e_{+}(\mathbf{x})=e_{-}(\mathbf{x})$ then $h_{(\alpha,\beta)}(\mathbf{x},\bm{\theta})$ is real-valued. Furthermore, under the hypothesis $d_{+}(\mathbf{x})=d_{-}(\mathbf{x})=e_{+}(\mathbf{x})=e_{-}(\mathbf{x})$ we deduce that $D^{(m)}_{+}=D^{(m)}_{-}=E^{(m)}_{+}=E^{(m)}_{-}$ is a positive definite diagonal block matrix, and choosing $G$ as the positive definite square root of $D^{(m)}_{+}$ , we find that $G^{-1}\mathbf{\mathcal{M}}^{(m)}_{(\alpha,\beta),N}G$ is similar to $\mathbf{\mathcal{M}}^{(m)}_{(\alpha,\beta),N}$ and real symmetric. Therefore, all the eigenvalues of $\mathbf{\mathcal{M}}^{(m)}_{(\alpha,\beta),N}$ are real and we plainly have $\{\mathbf{\mathcal{M}}^{(m)}_{(\alpha,\beta),N}\}_{{}_{N\in\mathbb{N}}}\sim_{\lambda}(h_{(\alpha,\beta)}(\mathbf{x},\bm{\theta}),[a_{1},b_{1}]\times[a_{2},b_{2}]\times[-\pi,\pi]^{2})$ , by exploiting again the GLT machinery, as done before but in the Hermitian setting. ∎

Now, let us assume that all the diffusion coefficients are uniformly bounded and positive. Under this hypothesis, the following proposition can be seen as an extension to the nonconstant coefficients case of the result for constant coefficients shown in Proposition 3.9.

Proposition 3.12.

Let us assume that $\alpha\neq\beta$ . Given $\mathbf{\mathcal{F}}_{(\alpha,\beta)}(\bm{\theta})$ as in (3.5) and $h_{(\alpha,\beta)}(\mathbf{x},\bm{\theta})$ as in (3.11), the following limit relation holds

[TABLE]

where $\gamma=\min\{\alpha,\beta\}$ , $\mathbf{x}=(x,y)$ , $\mathbf{0}=(0,0)$ and $\bm{\theta}=(\theta_{1},\theta_{2})$ .

Proof.

As in the proof of Proposition 3.9, we exploit the polar form of $1+e^{\mathbf{i}(\theta_{k}+\pi)}$ and $1-e^{-\mathbf{i}\theta_{k}}$ , for $k=1,2$ , and rewrite $h_{(\alpha,\beta)}(\mathbf{x},\bm{\theta})$ and $\mathbf{\mathcal{F}}_{(\alpha,\beta)}(\bm{\theta})$ as follows

[TABLE]

According to the definition of $\phi_{k}$ in (3.8), it is easy to see that for $\gamma\in(1,2)$ and $k=1,2$ , we have

[TABLE]

Combining relations (3.2) and (3.17), we obtain

[TABLE]

Now, we calculate the limit of the quotient $\frac{h_{(\alpha,\beta)}(\mathbf{x},\bm{\theta})}{\mathbf{\mathcal{F}}_{(\alpha,\beta)}(\bm{\theta})}$ . If $\gamma=\alpha$ , we have

[TABLE]

and then thanks to relations (3.18), (3.2), and (3.20), we conclude that

[TABLE]

If $\gamma=\beta$ , we obtain an analogous result with $\beta$ , $e_{\pm}$ , $\theta_{2}$ in place of $\alpha$ , $d_{\pm}$ , $\theta_{1}$ , respectively, and the thesis is proved. ∎

4 Multigrid methods

Multigrid methods have been shown to be a valid alternative to preconditioned Krylov methods also for FDEs [21]. Using the Ruge-Stüben theory [25], it is shown in Theorem 4 of [8] and in [21] that, in the one-dimensional case and for both constant and varying coefficients, i.e., $d_{\pm}(x,t)=d>0$ and $d_{+}(x,t)=d_{-}(x,t)>0$ , respectively, the two-grid method converges with a linear convergence rate independent of $N$ and $m$ . In the two-dimensional constant coefficient case, the matrix $\mathbf{\mathcal{M}}^{(m)}_{(\alpha,\beta),N}$ is a BTTB matrix, and the classical multigrid theory for BTTB matrices developed in [9, 6, 27, 2] can be directly applied when the symbol is known. Under the assumptions that $d_{\pm}(x,y,t)=d>0$ , $e_{\pm}(x,y,t)=e>0$ , $\frac{1}{r}=o(1)$ and $\frac{s}{r}=\frac{h^{\alpha}_{x}}{h^{\beta}_{y}}=\mathcal{O}(1)$ , according to our previous analysis in Subsection 3.1, the symbol of the BTTB sequence $\{\mathbf{\mathcal{M}}^{(m)}_{(\alpha,\beta),N}\}_{{}_{N\in\mathbb{N}}}$ is $d\,q_{\alpha}(\theta_{1})+\frac{s}{r}e\,q_{\beta}(\theta_{2})$ (cf. Proposition 3.8).

Consider the stationary iterative method

$$x^{(j+1)}=S_{N}x^{(j)}+b_{1}\qquad(4.1)$$

for the solution of the linear system $A_{N}x=b$ where $A_{N},W_{N},S_{N}:=I-W_{N}^{-1}A_{N}\in\mathbb{C}^{N\times N}$ , and $b,b_{1}:=W_{N}^{-1}b\in\mathbb{C}^{N}$ . Given a full-rank matrix $P_{N}\in\mathbb{C}^{N\times k}$ , with $k<N$ , a Two-Grid Method (TGM) is defined by the following algorithm [13].

[TABLE]

Step 6) consists in applying the “post-smoothing iteration” (4.1) $\nu$ times while steps 1) $\rightarrow$ 5) define the “coarse grid correction” which depends on the grid transfer operator $P_{N}$ . The iteration matrix of TGM is then given by

$$TGM(S_{N},P_{N})=S_{N}^{\nu}\left[I-P_{N}\left(P_{N}^{H}A_{N}P_{N}\right)^{-1}P_{N}^{H}A_{N}\right].$$

It would also be possible to add a “pre-smoothing” iteration to the TGM algorithm before step 1). We do not add it here in order to keep the theoretical analysis as simple as possible. Nevertheless, the addition of a convergent pre-smoother cannot deteriorate the convergence of the algorithm; it can only accelerate it.
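To make the structure of the two-grid iteration concrete, the following Python sketch (with NumPy) assembles a TGM iteration matrix with post-smoothing only and evaluates its energy norm. It is a toy illustration under stated assumptions, not the setting of the paper: the matrix is the one-dimensional ${\rm tridiag}(-1,2,-1)$, the size is $n=2k+1$, the smoother is damped Jacobi with $\omega=2/3$, the projector is linear interpolation, and all function names are hypothetical.

```python
import numpy as np


def laplacian(n):
    """Toy matrix A = tridiag(-1, 2, -1) of size n."""
    return 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)


def linear_interpolation(k):
    """n x k prolongation with n = 2k + 1; each column carries (1/2, 1, 1/2)."""
    n = 2 * k + 1
    P = np.zeros((n, k))
    for j in range(k):
        i = 2 * j + 1  # fine-grid index of the j-th coarse point
        P[i - 1, j], P[i, j], P[i + 1, j] = 0.5, 1.0, 0.5
    return P


def tgm_energy_norm(k, omega=2.0 / 3.0, nu=1):
    """Energy norm of S^nu (I - P (P^T A P)^{-1} P^T A)."""
    n = 2 * k + 1
    A, P = laplacian(n), linear_interpolation(k)
    S = np.eye(n) - omega * A / 2.0            # damped Jacobi, D = 2 I
    coarse = P.T @ A @ P                       # Galerkin coarse matrix
    cgc = np.eye(n) - P @ np.linalg.solve(coarse, P.T @ A)
    T = np.linalg.matrix_power(S, nu) @ cgc    # post-smoothing after the correction
    w, V = np.linalg.eigh(A)
    A_half = V @ np.diag(np.sqrt(w)) @ V.T
    A_half_inv = V @ np.diag(1.0 / np.sqrt(w)) @ V.T
    return np.linalg.norm(A_half @ T @ A_half_inv, 2)


norms = [tgm_energy_norm(k) for k in (7, 15, 31)]
print(norms)
```

In this toy setting the printed norms should stay well below $1$ and essentially unchanged as $k$ grows, which is the kind of size-independent bound that the convergence theory recalled below is designed to deliver.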

Here we want to give a formal proof of the convergence and optimality of the TGM. We recall some convergence results from the theory of algebraic multigrid methods in [25], which are the main theoretical tools for our convergence proof. By $\|\cdot\|_{2}$ we denote the Euclidean norm and, whenever $X$ is positive definite, $\|\cdot\|_{X}=\|X^{1/2}\cdot\|_{2}$ denotes the energy norm associated with $X$ . Finally, if $X$ and $Y$ are Hermitian matrices, then $X\leq Y$ means that $Y-X$ is nonnegative definite.

Theorem 4.1.

[25] Let $A_{N}$ be a positive definite matrix of size $N$ and let $S_{N}$ be defined as in the TGM algorithm. Suppose that $\exists\;\delta>0$ independent of $N$ such that

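$$\|S_{N}x\|^{2}_{A_{N}}\leq\|x\|^{2}_{A_{N}}-\delta\|x\|^{2}_{A_{N}D_{N}^{-1}A_{N}},\qquad\forall x\in\mathbb{C}^{N},\qquad(4.2)$$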

where $D_{N}$ is the diagonal matrix having the same diagonal of $A_{N}$ . Assume that $\exists\,\xi>0$ independent of $N$ such that

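$$\min_{y\in\mathbb{C}^{k}}\|x-P_{N}y\|^{2}_{D_{N}}\leq\xi\|x\|^{2}_{A_{N}},\qquad\forall x\in\mathbb{C}^{N}.\qquad(4.3)$$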

Then $\xi\geq\delta$ and

$$\|TGM(S_{N},P_{N})\|_{A_{N}}\leq\sqrt{1-\delta/\xi}.$$

When $N$ is huge, the coarser problem can also be too large to be solved directly. Hence, step 4) of the TGM should be replaced by a recursive application of the same strategy until a sufficiently small size is reached. A reliable initial guess for the coarser linear system is the zero vector. Indeed, at the coarser levels the linear system to be solved is the error equation, and hence its solution goes to zero whenever the whole iteration converges. The resulting algorithm is usually referred to as the V-cycle.
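The recursion just described can be sketched in a few lines of Python. The example below is a toy illustration and not the solver used in the paper: it works on the one-dimensional matrix $c\,{\rm tridiag}(-1,2,-1)$ of size $n=2^{L}-1$, uses one damped Jacobi sweep ($\omega=2/3$) before and after the correction, linear interpolation with column stencil $(1/2,1,1/2)$, and the Galerkin coarse matrix, which for this particular matrix equals $\frac{c}{2}\,{\rm tridiag}(-1,2,-1)$. All names are hypothetical.

```python
def apply_A(c, x):
    """y = c * tridiag(-1, 2, -1) x."""
    n = len(x)
    return [c * (2.0 * x[i]
                 - (x[i - 1] if i > 0 else 0.0)
                 - (x[i + 1] if i < n - 1 else 0.0)) for i in range(n)]


def jacobi(c, x, b, omega=2.0 / 3.0):
    """One damped Jacobi sweep; the diagonal of the matrix is 2c."""
    Ax = apply_A(c, x)
    return [x[i] + omega * (b[i] - Ax[i]) / (2.0 * c) for i in range(len(x))]


def v_cycle(c, x, b):
    n = len(x)
    if n == 1:
        return [b[0] / (2.0 * c)]              # direct solve on the coarsest grid
    x = jacobi(c, x, b)                        # pre-smoothing
    Ax = apply_A(c, x)
    r = [b[i] - Ax[i] for i in range(n)]
    k = (n - 1) // 2
    # restriction with P^T, where P has column stencil (1/2, 1, 1/2)
    rc = [0.5 * r[2 * j] + r[2 * j + 1] + 0.5 * r[2 * j + 2] for j in range(k)]
    # error equation on the coarse grid, started from the zero vector
    ec = v_cycle(0.5 * c, [0.0] * k, rc)
    for j in range(k):                         # prolongation and correction
        x[2 * j] += 0.5 * ec[j]
        x[2 * j + 1] += ec[j]
        x[2 * j + 2] += 0.5 * ec[j]
    return jacobi(c, x, b)                     # post-smoothing


def residual_norm(c, x, b):
    Ax = apply_A(c, x)
    return sum((b[i] - Ax[i]) ** 2 for i in range(len(x))) ** 0.5


n = 63
b = [1.0] * n
x = [0.0] * n
r0 = residual_norm(1.0, x, b)
for _ in range(10):
    x = v_cycle(1.0, x, b)
reduction = residual_norm(1.0, x, b) / r0
print(reduction)
```

The zero initial guess on every coarse level reflects the remark above: the coarse systems are error equations, so their solutions tend to zero as the outer iteration converges.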

4.1 Two-grid convergence analysis

The convergence analysis of the TGM is first developed in the constant coefficient case, i.e., in the case $A_{N}=T_{N}^{(2)}(\mathbf{\mathcal{F}}_{(\alpha,\beta)})$ . The variable coefficient case will be discussed at the end of the subsection.

Let us start by showing that damped Jacobi, with a proper choice of the relaxation parameter, satisfies the smoothing property (4.2).

Lemma 4.2.

Let $A_{N}:=T^{(2)}_{N}(\mathbf{\mathcal{F}}_{(\alpha,\beta)})$ with $\mathbf{\mathcal{F}}_{(\alpha,\beta)}$ defined as in (3.5) and let $S_{N}:=I_{N}-\omega D^{-1}_{N}A_{N}$ , where $\omega$ is a real number and $D_{N}=a_{0}I_{N}$ with $a_{0}$ the Fourier coefficient of $\mathbf{\mathcal{F}}_{(\alpha,\beta)}$ of order zero. Moreover, let us assume that $\frac{1}{r}=\frac{2h_{x}^{\alpha}}{\triangle t}=o(1)$ and $\frac{s}{r}=\frac{h_{x}^{\alpha}}{h_{y}^{\beta}}=\mathcal{O}(1)$ . If we choose $0<\omega<\frac{2a_{0}}{\|\mathbf{\mathcal{F}}_{(\alpha,\beta)}\|_{\infty}}$ , then $\exists\;\delta>0$ such that inequality (4.2) holds true.

Proof.

By recalling that $\mathbf{\mathcal{F}}_{(\alpha,\beta)}$ is nonnegative, it follows that $a_{0}=\frac{\|\mathbf{\mathcal{F}}_{(\alpha,\beta)}\|_{1}}{2\pi}$ . Hence the relation given in (4.2) is equivalent to the inequality

[TABLE]

which can be rewritten as

[TABLE]

using a congruence transformation with $A^{-\frac{1}{2}}$ . Since $D_{N}=a_{0}I_{N}$ , where $a_{0}=\frac{1}{r}-2(d\,w^{(\alpha)}_{1}+\frac{s}{r}e\,w^{(\beta)}_{1})$ and $a_{0}>0$ (cf. (2.1)), under the hypothesis that $\frac{1}{r}=\frac{2h_{x}^{\alpha}}{\triangle t}=o(1)$ and $\frac{s}{r}=\frac{h_{x}^{\alpha}}{h_{y}^{\beta}}=\mathcal{O}(1)$ , we have that $a_{0}$ is independent of $N$ . The inequality (4.4) reads as

[TABLE]

which is implied by the function inequality $(1-\frac{\omega}{a_{0}}\mathbf{\mathcal{F}}_{(\alpha,\beta)})^{2}\leq 1-(\frac{\delta}{a_{0}})\mathbf{\mathcal{F}}_{(\alpha,\beta)}$ . From the latter inequality, if $0<\omega<\frac{2a_{0}}{\|\mathbf{\mathcal{F}}_{(\alpha,\beta)}\|_{\infty}}$ then there exists $\delta>0$ such that inequality (4.2) holds and the thesis is proved (see Proposition 3 in [] for more details). ∎
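The last step of the proof can be made explicit with a short worked computation, given here as a sketch that uses only quantities already introduced. Since $\mathbf{\mathcal{F}}_{(\alpha,\beta)}\geq 0$ and $a_{0}>0$ , the function inequality holds trivially where $\mathbf{\mathcal{F}}_{(\alpha,\beta)}=0$ ; elsewhere, expanding the square and dividing by $\mathbf{\mathcal{F}}_{(\alpha,\beta)}/a_{0}>0$ gives

$$\Big(1-\frac{\omega}{a_{0}}\mathbf{\mathcal{F}}_{(\alpha,\beta)}\Big)^{2}\leq 1-\frac{\delta}{a_{0}}\mathbf{\mathcal{F}}_{(\alpha,\beta)}\quad\Longleftrightarrow\quad\delta\leq\omega\Big(2-\frac{\omega}{a_{0}}\mathbf{\mathcal{F}}_{(\alpha,\beta)}\Big).$$

The right-hand side is smallest where $\mathbf{\mathcal{F}}_{(\alpha,\beta)}$ attains its maximum, so one admissible choice is

$$\delta=\omega\Big(2-\frac{\omega\,\|\mathbf{\mathcal{F}}_{(\alpha,\beta)}\|_{\infty}}{a_{0}}\Big),$$

which is positive exactly when $0<\omega<\frac{2a_{0}}{\|\mathbf{\mathcal{F}}_{(\alpha,\beta)}\|_{\infty}}$ .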

In order to prove the approximation property (4.3), we define the projector $P_{N}$ as

[TABLE]

where the function $p$ will be defined later and $U^{k}_{N}=K^{k_{1}}_{n_{1}}\otimes K^{k_{2}}_{n_{2}}$ is the bidimensional down-sampling operator. Let $k_{\ell}=(n_{\ell}-(n_{\ell}\,\mathrm{mod}\,2))/2$ with $\ell=1,2$ , the one-dimensional down-sampling matrix $K^{k}_{n}\in\mathbb{R}^{n\times k}$ is defined as

[TABLE]

Lemma 4.4 gives the theoretical conditions that $p$ has to meet in order to satisfy (4.3). A preliminary proposition is necessary to extend the theoretical analysis usually carried out for matrix algebras, like circulant or $\tau$ matrices (see e.g. [2]), also to (multilevel) Toeplitz matrices.

Proposition 4.3.

[29] Given a matrix $A$ , let us denote by ${\rm diag}(A)$ the diagonal matrix having the same diagonal of $A$ . Moreover, let us assume that the property (4.3) is fulfilled by a matrix-sequence $\{A_{N}\}_{N}$ . If there exists another matrix-sequence $\{B_{N}\}_{N}$ such that

[TABLE]

$\eta,\vartheta>0$ , then property (4.3) is fulfilled by $\{B_{N}\}_{N}$ with the same $P_{N}$ as well.

Lemma 4.4.

Let $A_{N}:=T^{(2)}_{N}(\mathbf{\mathcal{F}}_{(\alpha,\beta)})$ with $\mathbf{\mathcal{F}}_{(\alpha,\beta)}$ defined as in (3.5) and $n_{\ell}=2k_{\ell}-1$ for $\ell=1,2$ . Moreover, let $P_{N}=T^{(2)}_{N}(p)U^{k}_{N}$ with $p$ the following polynomial

[TABLE]

and assume that $\frac{1}{r}=\frac{2h_{x}^{\alpha}}{\triangle t}=o(1)$ and $\frac{s}{r}=\frac{h_{x}^{\alpha}}{h_{y}^{\beta}}=\mathcal{O}(1)$ . Then there exists $\xi>0$ such that relation (4.3) holds true.

Proof.

The proof combines classical results on multigrid methods for Toeplitz matrices with the spectral results in Section 3. According to Proposition 3.8, the function $\mathbf{\mathcal{F}}_{(\alpha,\beta)}$ is nonnegative and vanishes only at the origin. Moreover, thanks to Proposition 3.9, the polynomial $p$ in (4.7) satisfies the classical condition

[TABLE]

with $M(\mathbf{x})$ being the set of the “mirror points” of $\mathbf{x}$ defined as $M(\mathbf{x})=\{(x_{1},\pi-x_{2}),(\pi-x_{1},x_{2}),(\pi-x_{1},\pi-x_{2})\}$ , see [27] for the derivation of condition (4.8). In particular, the condition (4.8) holds with $c=0$ . Therefore, the TGM defined in the two-level $\tau$ algebra would be convergent.

The result for BTTB matrices follows from Proposition 4.3. Note that $T^{(2)}_{N}(p)$ is a two-level $\tau$ matrix and $n_{\ell}=2k_{\ell}-1$ for $\ell=1,2$ . Hence the projector $P_{N}$ is exactly the same projector used in the TGM defined in the two-level $\tau$ algebra, cf. [27]. Finally, Proposition 4.3 can be applied by replacing the sequences $\{A_{N}\}_{N}$ and $\{B_{N}\}_{N}$ with the sequences of two-level $\tau$ and BTTB matrices generated by $\mathbf{\mathcal{F}}_{(\alpha,\beta)}$ , respectively. Indeed the condition (4.5) follows from Theorem 7.1 in [27], while the condition (4.6) is a trivial consequence of Proposition 2.1 with $\eta=1$ . ∎

Remark 4.5.

The grid transfer operator associated to the symbol $p$ defined as in (4.7) is, up to a constant factor, the classical bilinear interpolation.
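The link between the symbol and bilinear interpolation can be checked with a few lines of Python. The sketch below assumes that the polynomial in (4.7) is $p(\theta_{1},\theta_{2})=(1+\cos\theta_{1})(1+\cos\theta_{2})$ , possibly up to a constant factor; this explicit form is an assumption, since the formula is not reproduced here. Under that assumption, the Fourier coefficients of $p$ form the tensor product of $(1/2,1,1/2)$ with itself, and $p$ vanishes at the mirror points of the origin.

```python
import math


def p(t1, t2):
    """Assumed symbol of the projector: (1 + cos t1) * (1 + cos t2)."""
    return (1.0 + math.cos(t1)) * (1.0 + math.cos(t2))


# Fourier coefficients of 1 + cos(theta) of orders -1, 0, 1
coeff_1d = [0.5, 1.0, 0.5]

# Fourier coefficients of p: the 3 x 3 bilinear interpolation stencil
stencil = [[a * b for b in coeff_1d] for a in coeff_1d]

# mirror points M(x) of x = (0, 0), following the definition of M(x) above
mirror_of_origin = [(0.0, math.pi), (math.pi, 0.0), (math.pi, math.pi)]
mirror_values = [p(t1, t2) for t1, t2 in mirror_of_origin]

for row in stencil:
    print(row)
print(mirror_values)
```

The stencil is $\frac{1}{4}$ times the array with rows $(1,2,1)$ , $(2,4,2)$ , $(1,2,1)$ , i.e., the weights of bilinear interpolation, while $p(0,0)=4>0$ .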

By Lemmas 4.2 and 4.4, it follows that there exist $\delta$ and $\xi$ such that inequalities (4.2) and (4.3) hold, respectively. Therefore, by Theorem 4.1, $\xi\geq\delta$ and $\|TGM(S_{N},P_{N})\|_{A_{N}}\leq\sqrt{1-\delta/\xi}$ . The result can be summarized as follows.

Theorem 4.6.

Let $A_{N}:=T^{(2)}_{N}(\mathbf{\mathcal{F}}_{(\alpha,\beta)})$ with $\mathbf{\mathcal{F}}_{(\alpha,\beta)}$ defined as in (3.5).

•

Let $S_{N}:=I_{N}-\omega D^{-1}_{N}A_{N}$ , where $0<\omega<\frac{2a_{0}}{\|\mathbf{\mathcal{F}}_{(\alpha,\beta)}\|_{\infty}}$ and $D_{N}=a_{0}I_{N}$ with $a_{0}$ the Fourier coefficient of $\mathbf{\mathcal{F}}_{(\alpha,\beta)}$ of order zero (Jacobi smoother).

•

Let $P_{N}=T^{(2)}_{N}(p)U^{k}_{N}$ with $p$ defined as in (4.7) (bilinear interpolation).

Moreover, let us assume that $\frac{1}{r}=\frac{2h_{x}^{\alpha}}{\triangle t}=o(1)$ and $\frac{s}{r}=\frac{h_{x}^{\alpha}}{h_{y}^{\beta}}=\mathcal{O}(1)$ . Then the assumptions of Theorem 4.1 are satisfied and it holds

[TABLE]

with $c$ a constant independent of $N,\alpha$ , and $\beta$ .

The variable coefficients case can be addressed thanks to the extension of the previous results given in [27]. Let $d_{\pm}$ and $e_{\pm}$ be four uniformly bounded and positive functions. Then the linear convergence rate of the two-grid method is preserved combining Proposition 3.12 with Lemma 6.2 in [27].

4.2 V-cycle and geometric multigrid

The convergence analysis of the V-cycle is much more involved and a linear convergence rate has been proven, under a condition stricter than (4.8), only for sequences of matrices in some trigonometric algebras, like the $\tau$ algebra used in the proof of Lemma 4.4, see [2]. In detail, the symbol $p$ of the grid transfer operator has to satisfy

[TABLE]

With respect to (4.8), the numerator does not have the power two and hence $p$ has to vanish at the mirror points with double order.

Remark 4.7.

Similarly to Lemma 4.4, for $p$ defined as in (4.7) the condition (4.9) holds true with $c=0$ .

The previous remark suggests that bilinear interpolation is powerful enough to work also under some perturbations. In particular, we could use the geometric multigrid instead of the Galerkin approach. The geometric multigrid defines the coarser matrix $A_{k}$ by rediscretizing the original differential operator on the coarser grid instead of computing $A_{k}$ by $A_{k}=P_{N}^{T}A_{N}P_{N}$ . This saves some computational cost with respect to the Galerkin approach, which requires the computation of the coarser matrices at each recursion level in a precomputing phase. In particular, in the variable coefficient case the coarser matrices lose the GLT structure in terms of diagonal and BTTB matrices. Therefore we would have to store all the coefficients of the two-level lower Hessenberg coefficient matrices, and the computational cost of the precomputing phase would be much higher than the $\mathcal{O}(N\log N)$ arithmetic operations required for the matrix-vector product with the matrix $\mathbf{\mathcal{M}}^{(m)}_{(\alpha,\beta),N}$ .
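The $\mathcal{O}(N\log N)$ cost quoted above rests on FFT-based Toeplitz products. A minimal one-level sketch in Python with NumPy is given below; the function name `toeplitz_matvec` is hypothetical, the random data are only for testing, and the two-level (BTTB) case would proceed in the same way with a two-dimensional FFT. The idea is to embed the $n\times n$ Toeplitz matrix into a $2n\times 2n$ circulant matrix, which is diagonalized by the discrete Fourier transform.

```python
import numpy as np


def toeplitz_matvec(col, row, x):
    """y = T x, where T[i, j] = col[i - j] for i >= j and row[j - i] for i < j.

    Uses a 2n x 2n circulant embedding and three FFTs of length 2n.
    """
    n = len(x)
    # first column of the circulant embedding: col, one padding zero, reversed row
    c = np.concatenate([col, [0.0], row[:0:-1]])
    x_pad = np.concatenate([x, np.zeros(n)])
    y = np.fft.ifft(np.fft.fft(c) * np.fft.fft(x_pad))
    return y[:n].real


# comparison with the dense product on random data
rng = np.random.default_rng(0)
n = 64
col, row = rng.standard_normal(n), rng.standard_normal(n)
row[0] = col[0]
x = rng.standard_normal(n)
T = np.array([[col[i - j] if i >= j else row[j - i] for j in range(n)]
              for i in range(n)])
err = np.max(np.abs(T @ x - toeplitz_matvec(col, row, x)))
print(err)
```

Only the first column and first row of the Toeplitz matrix are stored, which is what makes the structured product cheap in memory as well as in time.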

In conclusion, the Galerkin approach is useful for the theoretical analysis in Subsection 4.1, but it is not computationally feasible. Therefore, we implement the geometric approach and we denote by $\mathbf{\mathcal{P}}^{(m)}_{{\rm MGM},N}$ one iteration of the geometric multigrid algorithm applied to the linear system with coefficient matrix ${\cal M}_{(\alpha,\beta),N}^{(m)}$ and defined by the following setting:

•

V-cycle with a number of recursion levels depending on the size $N$ (coarsest grid fixed to $8\times 8$ ).

•

The smoother consists of one step of pre-smoothing and one step of post-smoothing with the classical Jacobi method (according to Lemma 4.2).

•

The grid transfer operator is the standard bilinear interpolation and full-weighting corresponding to a proper scaling of the symbol $p$ defined as in (4.7) (according to Lemma 4.4).

4.3 Preconditioning

According to the theoretical analysis in Subsection 4.1 and the discussion at the end of Subsection 4.2, our geometric multigrid can be effectively used as a stand-alone solver. Nevertheless, in order to increase their robustness, multigrid methods are often applied as preconditioners for Krylov methods. Moreover, the application of multigrid methods as preconditioners allows a simple comparison with existing strategies based on (inverse) band or circulant preconditioners, see e.g., [14, 16, 34].

A further possibility is the combination of a band preconditioner with a multigrid method based on the Galerkin approach. The resulting strategy keeps the computational cost of the precomputing phase to $\mathcal{O}(N)$ . A simple band preconditioner suitable for Galerkin multigrid methods is the Laplacian preconditioner ( $\alpha=\beta=2$ ). Such a preconditioner is inspired by the 1D analysis in [8], where the use of the Laplacian was introduced for fractional derivatives of order greater than 1.5. In this paper, independently of the values of $\alpha$ and $\beta$ , we propose the preconditioner

[TABLE]

where $A^{2}_{n_{j}}$ , for $j=1,2$ , is the Laplacian matrix. The subscript $2$ in $\mathbf{\mathcal{P}}^{(m)}_{2,N}$ recalls the order of the derivative used in the preconditioner. Note that the matrix $\mathbf{\mathcal{P}}^{(m)}_{2,N}$ has only five nonzero diagonals. On the other hand, due to fill-in, the direct solution of the linear system with coefficient matrix $\mathbf{\mathcal{P}}^{(m)}_{2,N}$ is not feasible. Therefore, instead of solving the corresponding linear system, we apply only one step of our multigrid consisting of the standard Galerkin approach with Jacobi smoothing and bilinear interpolation as grid transfer operator. Note that Theorem 4.6 holds also with $\alpha=\beta=2$ and the resulting preconditioner has a computational cost linear in $N$ .

The quality of this preconditioner depends on the values of $\alpha$ and $\beta$ . If a preconditioner is to have the same structure as the matrix $\mathbf{\mathcal{M}}^{(m)}_{(\alpha,\beta),N}$ while keeping a small bandwidth, its symbol, being that of a banded BTTB matrix, has to be a trigonometric polynomial, and hence the zero of the symbol cannot be of fractional order. Nevertheless, the condition number of the preconditioned matrix $(\mathbf{\mathcal{P}}^{(m)}_{2,N})^{-1}{\cal M}_{(\alpha,\beta),N}^{(m)}$ grows asymptotically as $N^{\frac{2-\gamma}{2}}$ , with $\gamma=\max\{\alpha,\beta\}$ s.t. $0<2-\gamma<1$ , and hence the number of iterations of a conjugate gradient type method grows as $\mathcal{O}(N^{\frac{2-\gamma}{4}})$ , see [3]. In conclusion, the preconditioner $\mathbf{\mathcal{P}}^{(m)}_{2,N}$ is a good choice whenever $\alpha$ or $\beta$ are close to $2$ , as confirmed by the numerical results in the next section.
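To give a feeling for the estimate $\mathcal{O}(N^{\frac{2-\gamma}{4}})$ , the short Python sketch below evaluates the factor by which this bound grows when $N$ is multiplied by $4$ , i.e., when the number of grid points is doubled in each direction. The function name and the sample values of $\gamma$ are illustrative choices, and the numbers describe the asymptotic bound only, not measured iteration counts.

```python
def iteration_growth(gamma, refine=4.0):
    """Growth factor of N^((2 - gamma) / 4) when N is replaced by refine * N."""
    return refine ** ((2.0 - gamma) / 4.0)


growth = {g: iteration_growth(g) for g in (1.2, 1.5, 1.8)}
for g, factor in growth.items():
    print(g, round(factor, 4))
```

For $\gamma=1.8$ the factor is $2^{0.1}$ , so the bound grows by roughly seven percent per refinement, while for $\gamma=1.2$ it is $2^{0.4}$ , roughly thirty-two percent; this is consistent with the remark that the preconditioner is most attractive when the fractional orders are close to $2$ .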

5 Numerical results

In this section we test the effectiveness of our new multigrid preconditioners $\mathbf{\mathcal{P}}^{(m)}_{2,N}$ and $\mathbf{\mathcal{P}}^{(m)}_{{\rm MGM},N}$ defined in the previous section. Since two of the three proposed examples are taken from [14], we compare the performance of our preconditioners with the proposal therein (the same Krylov method and the same tolerance are used).

In the following tables, according to the notation defined in Section 1, $n_{1}$ and $n_{2}$ denote the numbers of spatial partitions in the $x$ -direction and $y$ -direction, respectively, while $M$ denotes the number of time steps. In all our examples we simply fix $n_{1}=n_{2}=M$ . The infinity norm of the difference between the exact solution and the numerical solution at the last time step is denoted by “ $Error$ ”. The number of iterations is computed as the arithmetic average of the number of iterations required for solving (2.9) at each time step $t^{(m)}$ . For this reason our multigrid preconditioners will be simply denoted by $\mathbf{\mathcal{P}}_{2}$ and $\mathbf{\mathcal{P}}_{{\rm MGM}}$ . We use the GMRES method to solve the discretized linear systems (2.9). The GMRES method is run using the built-in gmres Matlab function with tolerance $10^{-7}$ and restarted every 20 iterations, even though restarting is not necessary for the preconditioned iterations. The initial guess at each time step is chosen as the zero vector. Our computations are performed using Matlab 7.10 software on a Pentium IV, 3.50 GHz CPU machine with 4 Gbyte of memory.

5.1 Example 1

The first example is taken from [22]. We consider an FDE of type (2.1) with $\alpha=1.8,~{}\beta=1.6$ . The nonconstant diffusion coefficients are given by

[TABLE]

The spatial domain is $\Omega=[0,2]\times[0,2]$ , while the time interval is $[0,T]=[0,1]$ . The initial condition is

[TABLE]

and the source term is such that the solution to the FDE is given by $u(x,y,t)=16{\rm e}^{-t}x^{2}(2-x)^{2}y^{2}(2-y)^{2}$ .

Let $h=h_{x}=h_{y}=2/(n+1)$ , with $n=n_{1}=n_{2}=M$ . Therefore, we have

[TABLE]

Since $\alpha=1.8$ and $\beta=1.6$ we obtain $\frac{1}{r}=O(n^{-0.8})$ and $\frac{s}{r}=O(n^{-0.2})$ . Hence both $\frac{1}{r}$ and $\frac{s}{r}$ tend to zero as $n\rightarrow\infty$ . Since $\frac{s}{r}$ decays slowly, the term in $\beta$ goes slowly to zero, and for large $n$ the term in $\alpha$ becomes dominant.

Table 1 compares the number of iterations and the accuracy of the numerical solutions provided by GMRES without preconditioning and by GMRES preconditioned with our proposals. For comparison, we also report the number of iterations required by the “exact” preconditioners $\widetilde{\mathbf{\mathcal{P}}}_{2}$ and $\widetilde{\mathbf{\mathcal{P}}}_{{\rm MGM}}$ , where $\widetilde{\mathbf{\mathcal{P}}}_{2}$ denotes the preconditioner $\mathbf{\mathcal{P}}_{2}$ with the direct solution of the resulting linear system and $\widetilde{\mathbf{\mathcal{P}}}_{{\rm MGM}}$ denotes the proposed multigrid applied to the matrix $\mathbf{\mathcal{M}}^{(m)}_{(\alpha,\beta),N}$ , but using the Galerkin approach instead of rediscretization. Note that the geometric approach is as effective as the Galerkin approach ( $\mathbf{\mathcal{P}}_{{\rm MGM}}$ and $\widetilde{\mathbf{\mathcal{P}}}_{{\rm MGM}}$ provide the same number of iterations). On the contrary, one V-cycle step for the preconditioner $\mathbf{\mathcal{P}}_{2}$ is not as good as its direct inversion, i.e., $\widetilde{\mathbf{\mathcal{P}}}_{2}$ . Nevertheless, it shows a linear convergence rate and it is computationally very cheap, since $\mathbf{\mathcal{P}}_{2}$ is a pentadiagonal matrix. A good compromise could be to increase the number of V-cycle steps of the preconditioner $\mathbf{\mathcal{P}}_{2}$ : for instance, fixing $n=2^{7}$ and applying two V-cycle steps instead of one reduces the number of iterations from 17 to 14.
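The idea of using one or more V-cycle steps as an approximate solver can be illustrated in a much simpler setting. The following Python sketch is not the authors' implementation: it assumes a 1D model matrix ${\rm tridiag}(-1,2,-1)$ of size $n=2^{k}-1$ in place of the 2D matrix, damped Jacobi smoothing with weight $2/3$ , one pre-smoothing and one post-smoothing sweep, linear interpolation and full weighting. All function names are hypothetical.

```python
# Illustrative 1D analogue (assumption): A = tridiag(-1, 2, -1), n = 2**k - 1.

def apply_a(u):
    """Multiply tridiag(-1, 2, -1) by u (zero values outside the vector)."""
    n = len(u)
    return [2.0 * u[i]
            - (u[i - 1] if i > 0 else 0.0)
            - (u[i + 1] if i < n - 1 else 0.0)
            for i in range(n)]


def jacobi(u, f, sweeps, omega=2.0 / 3.0):
    """Damped Jacobi sweeps for A u = f; the diagonal of A is 2."""
    for _ in range(sweeps):
        au = apply_a(u)
        u = [ui + omega * (fi - ai) / 2.0 for ui, fi, ai in zip(u, f, au)]
    return u


def v_cycle(u, f, pre=1, post=1):
    """One V-cycle step for A u = f, starting from the guess u."""
    n = len(u)
    if n == 1:
        return [f[0] / 2.0]              # exact solve on the coarsest grid
    u = jacobi(list(u), f, pre)          # pre-smoothing on a copy of u
    r = [fi - ai for fi, ai in zip(f, apply_a(u))]
    nc = (n - 1) // 2
    # Full weighting (1/4)[1 2 1] times 4: with linear interpolation the
    # Galerkin coarse matrix is (1/4) tridiag(-1, 2, -1), so the factor 4
    # lets every level reuse the same unscaled stencil.
    rc = [r[2 * j] + 2.0 * r[2 * j + 1] + r[2 * j + 2] for j in range(nc)]
    ec = v_cycle([0.0] * nc, rc, pre, post)
    for j in range(nc):                  # linear interpolation of the correction
        u[2 * j] += 0.5 * ec[j]
        u[2 * j + 1] += ec[j]
        u[2 * j + 2] += 0.5 * ec[j]
    return jacobi(u, f, post)            # post-smoothing


# Usage: build a right-hand side from a known vector and iterate the V-cycle.
n = 31
exact = [float((7 * i) % 5 - 2) for i in range(n)]
f = apply_a(exact)
u = [0.0] * n
for _ in range(30):
    u = v_cycle(u, f)
error = max(abs(a - b) for a, b in zip(u, exact))
```

In a preconditioning context, one call to `v_cycle` with a zero initial guess plays the role of applying the approximate inverse to a vector, and calling it twice corresponds to the two V-cycle steps mentioned above.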

5.2 Example 2

This example is taken from [14]. We consider an FDE of type (2.1) with $\alpha=1.8,~{}\beta=1.9$ . The nonconstant diffusion coefficients are given by

[TABLE]

The spatial domain is $\Omega=[0,1]\times[0,1]$ , while the time interval is $[0,T]=[0,1]$ . The initial condition is

[TABLE]

and the source term is such that the solution to the FDE is given by $u(x,y,t)={\rm e}^{-t}x^{3}(1-x)^{3}y^{3}(1-y)^{3}$ .

Note that

[TABLE]

Since $\alpha=1.8$ and $\beta=1.9$ we obtain $\frac{1}{r}=O(n^{-0.8})$ and $\frac{s}{r}=O(n^{0.1})$ , i.e., when $n\rightarrow\infty$ , $\frac{1}{r}$ goes to zero and $\frac{s}{r}$ tends to infinity. However, $\frac{s}{r}$ grows very slowly and the numerical results are not affected by this small growth.
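The exponents quoted here and in Example 1 can be reproduced with a few lines of Python. The sketch below rests on an assumption that is not restated in this section: $r$ scales like $\Delta t/h^{\alpha}$ and $s$ like $\Delta t/h^{\beta}$ , with $\Delta t=T/n$ , $h=L/(n+1)$ and the same constant in both. The function names and default arguments are hypothetical.

```python
def ratio_exponents(alpha, beta):
    """Return (p, q) such that 1/r = O(n**p) and s/r = O(n**q)."""
    return 1.0 - alpha, beta - alpha


def ratios(alpha, beta, n, length=1.0, final_time=1.0):
    """Evaluate 1/r and s/r for a given n under the scaling assumption."""
    h = length / (n + 1)          # spatial step (assumed equal in x and y)
    dt = final_time / n           # time step, since M = n
    r = dt / h ** alpha           # assumed scaling of r
    s = dt / h ** beta            # assumed scaling of s
    return 1.0 / r, s / r


print(ratio_exponents(1.8, 1.6))  # exponents for Example 1
print(ratio_exponents(1.8, 1.9))  # exponents for Example 2
print(ratios(1.8, 1.9, n=2 ** 10))
```

Under these assumptions, $\frac{s}{r}$ for $\alpha=1.8$ , $\beta=1.9$ is still only about 2 at $n=2^{10}$ , which illustrates how slow the growth $O(n^{0.1})$ is.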

In Table 2 we compare the average number of iterations obtained by GMRES and by our preconditioners with the performance of the preconditioner proposed in [14] and denoted by $P_{JLZ}$ . Moreover, we show the accuracy of the numerical solution. We observe that in this example the average number of iterations for the preconditioner $P_{JLZ}$ grows with the algebraic size of the problem. Conversely, our preconditioners show a linear convergence rate with a lower computational cost per iteration, especially the preconditioner $\mathbf{\mathcal{P}}_{2}$ , which has only five nonzero diagonals and applies only two Jacobi iterations on each grid.

5.3 Example 3

This third example is also taken from [14], and $\alpha,\beta$ are the same as in Example 2. The nonconstant diffusion coefficients are given by

[TABLE]

The spatial domain is $\Omega=[0,1]\times[0,1]$ , while the time interval is $[0,T]=[0,1]$ . The initial condition is

[TABLE]

and the source term is such that the solution of the FDE is given by $u(x,y,t)={\rm e}^{-t}x^{3}(1-x)^{3}y^{3}(1-y)^{3}$ .

The behaviour of $\frac{1}{r}$ and $\frac{s}{r}$ is the same as in the previous example. The results in Table 3 are comparable to those of the previous example, and again our preconditioners lead to fast convergence with a number of iterations independent of the size $N$ .

6 Conclusions

In this paper we have investigated two-dimensional space-FDE problems discretized by means of a second order finite difference scheme obtained as a combination of the Crank-Nicolson scheme and the so-called weighted and shifted Grünwald formula. We have provided a detailed spectral analysis of the coefficient matrix by means of its symbol, in both the constant and the variable coefficient cases. Thanks to the symbol analysis and the theory of multigrid methods for BTTB matrices, we have designed a classical multigrid method particularly effective for the considered problem. Two multigrid preconditioners for GMRES have been proposed, and some numerical examples taken from the literature show that they provide fast convergence independent of the algebraic size of the problem.

Multigrid methods have shown good scalability with respect to the dimensionality of the problem, so they could also be a good choice for 3D and 4D problems, since their extension is straightforward. Moreover, they could also be effectively applied to other discretization strategies, like the finite volume method proposed in [20].

Acknowledgment

The work of the last two authors is partly supported by the Italian grant MIUR - PRIN 2012 N. 2012MTE38N and by GNCS-INDAM (Italy).

Bibliography

[1] A. Aricò, M. Donatelli, A V-cycle Multigrid for multilevel matrix algebras: proof of optimality, Numer. Math., 105 (2007) 511-547.
[2] A. Aricò, M. Donatelli, S. Serra-Capizzano, V-cycle optimal convergence for certain (multilevel) structured linear systems, SIAM J. Matrix Anal. Appl., 26 (2004) 186-214.
[3] O. Axelsson, G. Lindskog, On the rate of convergence of the preconditioned conjugate gradient method, Numer. Math., 48 (1986) 499-523.
[4] J. Bai, X. Feng, Fractional-order anisotropic diffusion for image denoising, IEEE Trans. Image Process., 16 (2007) 2492-2502.
[5] B. A. Carreras, V. E. Lynch, G. M. Zaslavsky, Anomalous diffusion and exit time distribution of particle tracers in plasma turbulence model, Phys. Plasmas, 8 (2001) 5096-5103.
[6] R. H. Chan, Q. Chang, H. W. Sun, Multigrid method for ill-conditioned symmetric Toeplitz systems, SIAM J. Sci. Comput., 19 (1998) 516-529.
[7] M. Chen, Y. Wang, X. Cheng, W. Deng, Second-order LOD multigrid method for multidimensional Riesz fractional diffusion equation, BIT Numer. Math., 54 (2014) 623-637.
[8] M. Donatelli, M. Mazza, S. Serra-Capizzano, Spectral analysis and structure preserving preconditioners for fractional diffusion equations, J. Comput. Phys., 307 (2016) 262-279.
