A preconditioning strategy for linear systems arising from nonsymmetric   schemes in isogeometric analysis

Mattia Tani

arXiv:1705.04384·math.NA·May 15, 2017·Comput. Math. Appl.

A preconditioning strategy for linear systems arising from nonsymmetric schemes in isogeometric analysis

Mattia Tani

PDF

Open Access

TL;DR

This paper introduces a preconditioning strategy for solving nonsymmetric linear systems from isogeometric analysis discretizations, improving efficiency and robustness in 2D and 3D problems.

Contribution

It extends a known tensor-structured solver preconditioning approach to nonsymmetric systems from collocation and weighted quadrature methods in isogeometric analysis.

Findings

01

Preconditioner is effective for 2D and 3D problems.

02

Preconditioner shows robustness with respect to mesh size and spline degree.

03

Numerical experiments confirm improved solver efficiency.

Abstract

In the context of isogeometric analysis, we consider two discretization approaches that make the resulting stiffness matrix nonsymmetric even if the differential operator is self-adjoint. These are the collocation method and the recently-developed weighted quadrature for the Galerkin discretization. In this paper, we are interested in the solution of the linear systems arising from the discretization of the Poisson problem using one of these approaches. In [SIAM J. Sci. Comput. 38(6) (2016) pp. A3644--A3671], a well-established direct solver for linear systems with tensor structure was used as a preconditioner in the context of Galerkin isogeometric analysis, yielding promising results. In particular, this preconditioner is robust with respect to the mesh size $h$ and the spline degree $p$ . In the present work, we discuss how a similar approach can applied to the considered nonsymmetric…

Tables5

Table 1. Table 1: Values of e h subscript 𝑒 ℎ e_{h} for different h ℎ h and p 𝑝 p for the quarter of ring domain and K ( 𝐱 ) ≡ I d 𝐾 𝐱 subscript 𝐼 𝑑 K(\mathbf{x})\equiv I_{d} .

	$h = 1 / 8$	$h = 1 / 16$	$h = 1 / 32$	$h = 1 / 64$
$p = 2$	$2.66 \cdot 10^{- 2}$	$1.41 \cdot 10^{- 2}$	$7.21 \cdot 10^{- 3}$	$3.65 \cdot 10^{- 3}$
$p = 3$	$2.40 \cdot 10^{- 2}$	$1.26 \cdot 10^{- 2}$	$6.43 \cdot 10^{- 3}$	$3.25 \cdot 10^{- 3}$
$p = 4$	$1.38 \cdot 10^{- 2}$	$7.18 \cdot 10^{- 3}$	$3.67 \cdot 10^{- 3}$	$1.86 \cdot 10^{- 3}$

Table 2. Table 2: Collocation approach, quarter of ring domain. Number of iterations and CPU time for BiCGStab preconditioned with FD (upper table) and ILU(0) (lower table).

	BiCGStab + $𝒫$ Iterations / Time (sec)
$h^{- 1}$	$p = 2$	$p = 3$	$p = 4$	$p = 5$
128	13.5 / 00.05	13.5 / 00.05	12.0 / 00.06	12.0 / 00.06
256	13.5 / 00.24	13.5 / 00.25	13.5 / 00.28	13.5 / 00.29
512	13.5 / 01.59	13.5 / 01.49	13.5 / 01.58	13.5 / 01.60
1024	13.5 / 10.28	13.5 / 10.18	13.5 / 10.35	13.5 / 10.14

Table 3. Table 3: WQ Galerkin approach, quarter of ring domain. Number of iterations and CPU time for BiCGStab preconditioned with FD (upper table) and ILU(0) (lower table).

	BiCGStab + $𝒫$ Iterations / Time (sec)
$h^{- 1}$	$p = 2$	$p = 3$	$p = 4$	$p = 5$
128	16.0 / 00.23	16.0 / 00.23	16.0 / 00.25	16.0 / 00.36
256	16.0 / 00.54	16.0 / 00.59	16.0 / 00.75	16.0 / 00.86
512	16.0 / 01.92	16.0 / 02.35	16.0 / 04.71	16.0 / 05.73
1024	16.0 / 14.12	16.0 / 12.83	16.0 / 20.97	16.0 / 23.57

Table 4. Table 4: Collocation approach, revolved ring domain. Number of iterations and CPU time for BiCGStab preconditioned with FD (upper table) and ILU(0) (lower table).

	BiCGStab + $𝒫$ Iterations / Time (sec)
$h^{- 1}$	$p = 2$	$p = 3$	$p = 4$	$p = 5$
16	16.0 / 0.04	15.5 / 0.04	17.5 / 0.05	17.5 / 0.07
32	16.5 / 0.11	18.5 / 0.13	20.5 / 0.32	22.0 / 0.40
64	17.5 / 0.88	19.5 / 1.04	21.5 / 2.47	22.5 / 3.12

Table 5. Table 5: WQ Galerkin approach, revolved ring domain. Number of iterations and CPU time for BiCGStab preconditioned with FD (upper table) and ILU(0) (lower table).

	BiCGStab + $𝒫$ Iterations / Time (sec)
$h^{- 1}$	$p = 2$	$p = 3$	$p = 4$	$p = 5$
16	27.5 / 00.10	27.0 / 00.16	25.5 / 000.24	23.5 / 00.42
32	29.5 / 00.44	29.5 / 01.08	29.5 / 002.35	29.5 / 04.48
64	31.5 / 04.17	26.5 / 06.52	26.5 / 013.69	26.0 / 24.80
128	32.5 / 42.95	30.0 / 76.73	27.5 / 162.59	*

Equations116

{- div (K (x) \nabla u (x)) = f (x) u = 0 on Ω, on \partial Ω,

{- div (K (x) \nabla u (x)) = f (x) u = 0 on Ω, on \partial Ω,

0 = ξ_{1} = \dots = ξ_{p + 1} < ξ_{p + 2} \leq \dots \leq ξ_{m} < ξ_{m + 1} = \dots = ξ_{m + p + 1} = 1.

0 = ξ_{1} = \dots = ξ_{p + 1} < ξ_{p + 2} \leq \dots \leq ξ_{m} < ξ_{m + 1} = \dots = ξ_{m + p + 1} = 1.

\hat{B}_{i} (ξ) = \hat{B}_{i_{1}, 1} (ξ_{1}) \dots \hat{B}_{i_{d}, d} (ξ_{d}),

\hat{B}_{i} (ξ) = \hat{B}_{i_{1}, 1} (ξ_{1}) \dots \hat{B}_{i_{d}, d} (ξ_{d}),

F (ξ) = i \sum C_{i} \hat{B}_{i} (ξ), C_{i} \in R^{d}

F (ξ) = i \sum C_{i} \hat{B}_{i} (ξ), C_{i} \in R^{d}

V_{h} = {\hat{B}_{i} \circ F^{- 1} ∣ i = (i_{1}, \dots, i_{d}), 1 \leq i_{l} \leq m - 2, 1 \leq l \leq d} .

V_{h} = {\hat{B}_{i} \circ F^{- 1} ∣ i = (i_{1}, \dots, i_{d}), 1 \leq i_{l} \leq m - 2, 1 \leq l \leq d} .

{\color[rgb]{0,0,0}\tau_{l,i}=\frac{1}{p}\sum_{j=1}^{p}\xi_{i+j+1}^{(l)},\qquad i=1,\ldots,n,}

{\color[rgb]{0,0,0}\tau_{l,i}=\frac{1}{p}\sum_{j=1}^{p}\xi_{i+j+1}^{(l)},\qquad i=1,\ldots,n,}

L u_{h} := - \mbox d i v (K \nabla u_{h}),

L u_{h} := - \mbox d i v (K \nabla u_{h}),

(L u_{h}^{C}) (F (τ_{i})) = f (F (τ_{i})), i = 1, \dots, N .

(L u_{h}^{C}) (F (τ_{i})) = f (F (τ_{i})), i = 1, \dots, N .

A_{C} u_{C} = f_{C},

A_{C} u_{C} = f_{C},

A_{C} = K_{2}^{C} \otimes M_{1}^{C} + M_{2}^{C} \otimes K_{1}^{C},

A_{C} = K_{2}^{C} \otimes M_{1}^{C} + M_{2}^{C} \otimes K_{1}^{C},

(M_{l}^{C})_{ij} = \hat{B}_{l, j} (τ_{l, i}), (K_{l}^{C})_{ij} = - \hat{B}_{l, j}^{^{''}} (τ_{l, i}), l = 1, 2, i, j = 1, \dots, n .

(M_{l}^{C})_{ij} = \hat{B}_{l, j} (τ_{l, i}), (K_{l}^{C})_{ij} = - \hat{B}_{l, j}^{^{''}} (τ_{l, i}), l = 1, 2, i, j = 1, \dots, n .

A_{C} = K_{3}^{C} \otimes M_{2}^{C} \otimes M_{1}^{C} + M_{3}^{C} \otimes K_{2}^{C} \otimes M_{1}^{C} + M_{3}^{C} \otimes M_{2}^{C} \otimes K_{1}^{C},

A_{C} = K_{3}^{C} \otimes M_{2}^{C} \otimes M_{1}^{C} + M_{3}^{C} \otimes K_{2}^{C} \otimes M_{1}^{C} + M_{3}^{C} \otimes M_{2}^{C} \otimes K_{1}^{C},

(A_{G})_{i j}

(A_{G})_{i j}

{\color[rgb]{0,0,0}Q(\xi)=\mbox{det}\left(J_{\textbf{F}}(\xi)\right)\;J_{\textbf{F}}\left(\xi\right)^{-1}K\left(\textbf{F}(\xi)\right)J_{\textbf{F}}\left(\xi\right)^{-T}.}

{\color[rgb]{0,0,0}Q(\xi)=\mbox{det}\left(J_{\textbf{F}}(\xi)\right)\;J_{\textbf{F}}\left(\xi\right)^{-1}K\left(\textbf{F}(\xi)\right)J_{\textbf{F}}\left(\xi\right)^{-T}.}

\left(\mathcal{A}_{wq}\right)_{\boldsymbol{i}\boldsymbol{j}}:=\sum_{\alpha,\beta=1}^{d}\sum_{\boldsymbol{q}}{\color[rgb]{0,0,0}c_{\alpha\beta}}(\boldsymbol{x}_{{\color[rgb]{0,0,0}\boldsymbol{q}}})\;w_{\boldsymbol{i}\boldsymbol{q}}^{\alpha\beta}\;\partial_{\beta}\hat{B}_{\boldsymbol{j}}(\boldsymbol{x}_{\boldsymbol{q}})\approx\left(\mathcal{A}_{G}\right)_{\boldsymbol{i}\boldsymbol{j}},

\left(\mathcal{A}_{wq}\right)_{\boldsymbol{i}\boldsymbol{j}}:=\sum_{\alpha,\beta=1}^{d}\sum_{\boldsymbol{q}}{\color[rgb]{0,0,0}c_{\alpha\beta}}(\boldsymbol{x}_{{\color[rgb]{0,0,0}\boldsymbol{q}}})\;w_{\boldsymbol{i}\boldsymbol{q}}^{\alpha\beta}\;\partial_{\beta}\hat{B}_{\boldsymbol{j}}(\boldsymbol{x}_{\boldsymbol{q}})\approx\left(\mathcal{A}_{G}\right)_{\boldsymbol{i}\boldsymbol{j}},

A_{w q} u_{w q} = f_{w q} .

A_{w q} u_{w q} = f_{w q} .

A_{w q} = K_{2}^{w q} \otimes M_{1}^{w q} + M_{2}^{w q} \otimes K_{1}^{w q},

A_{w q} = K_{2}^{w q} \otimes M_{1}^{w q} + M_{2}^{w q} \otimes K_{1}^{w q},

(M_{l}^{w q})_{ij} = \int_{0}^{1} \hat{B}_{l, i} (ξ) \hat{B}_{l, j} (ξ) d ξ, (K_{l}^{w q})_{ij} = \int_{0}^{1} \hat{B}_{l, i}^{^{'}} (ξ) \hat{B}_{l, j}^{^{'}} (ξ) d ξ,

(M_{l}^{w q})_{ij} = \int_{0}^{1} \hat{B}_{l, i} (ξ) \hat{B}_{l, j} (ξ) d ξ, (K_{l}^{w q})_{ij} = \int_{0}^{1} \hat{B}_{l, i}^{^{'}} (ξ) \hat{B}_{l, j}^{^{'}} (ξ) d ξ,

A_{w q} = K_{3}^{w q} \otimes M_{2}^{w q} \otimes M_{1}^{w q} + M_{3}^{w q} \otimes K_{2}^{w q} \otimes M_{1}^{w q} + M_{3}^{w q} \otimes M_{2}^{w q} \otimes K_{1}^{w q},

A_{w q} = K_{3}^{w q} \otimes M_{2}^{w q} \otimes M_{1}^{w q} + M_{3}^{w q} \otimes K_{2}^{w q} \otimes M_{1}^{w q} + M_{3}^{w q} \otimes M_{2}^{w q} \otimes K_{1}^{w q},

A u = f,

A u = f,

ξ \in Ω in f λ_{m i n} (Q (ξ)) \leq λ \leq ξ \in Ω sup λ_{m a x} (Q (ξ))

ξ \in Ω in f λ_{m i n} (Q (ξ)) \leq λ \leq ξ \in Ω sup λ_{m a x} (Q (ξ))

a(u_{h},v_{h})=\int_{\Omega}\left(\nabla u_{h}\left(\mathbf{x}\right)\right)^{T}K(\mathbf{x})\nabla v_{h}\left(\mathbf{x}\right)\;d\mathbf{x},\qquad{\color[rgb]{0,0,0}u_{h},v_{h}\in V_{h}.}

a(u_{h},v_{h})=\int_{\Omega}\left(\nabla u_{h}\left(\mathbf{x}\right)\right)^{T}K(\mathbf{x})\nabla v_{h}\left(\mathbf{x}\right)\;d\mathbf{x},\qquad{\color[rgb]{0,0,0}u_{h},v_{h}\in V_{h}.}

h ⟶ 0^{+} lim 0 \neq = v_{h}, w_{h} \in V_{h} sup \frac{∣ a _{h} ( w _{h} , v _{h} ) - a ( w _{h} , v _{h} ) ∣}{∥ w _{h} ∥ _{H^{1} (Ω)} ∥ v _{h} ∥ _{H^{1} (Ω)}} = 0.

h ⟶ 0^{+} lim 0 \neq = v_{h}, w_{h} \in V_{h} sup \frac{∣ a _{h} ( w _{h} , v _{h} ) - a ( w _{h} , v _{h} ) ∣}{∥ w _{h} ∥ _{H^{1} (Ω)} ∥ v _{h} ∥ _{H^{1} (Ω)}} = 0.

h ⟶ 0^{+} lim λ_{*} \in Λ_{*} max λ \in Λ min \frac{∣ λ - λ _{*} ∣}{λ} = 0.

h ⟶ 0^{+} lim λ_{*} \in Λ_{*} max λ \in Λ min \frac{∣ λ - λ _{*} ∣}{λ} = 0.

λ_{*} \in Λ_{*} max λ \in Λ min \frac{∣ λ - λ _{*} ∣}{λ} \leq A^{- 1/2} E A^{- 1/2},

λ_{*} \in Λ_{*} max λ \in Λ min \frac{∣ λ - λ _{*} ∣}{λ} \leq A^{- 1/2} E A^{- 1/2},

A^{- 1/2} E A^{- 1/2} = 0 \neq = v, w \in R^{N} max \frac{v ^{T} A ^{- 1/2} E A ^{- 1/2} w}{∥ v ∥ ∥ w ∥} = 0 \neq = w_{h}, v_{h} \in V_{h} sup \frac{∣ a _{h} ( w _{h} , v _{h} ) - a ( w _{h} , v _{h} ) ∣}{a ( v _{h} , v _{h} ) a ( w _{h} , w _{h} )} .

A^{- 1/2} E A^{- 1/2} = 0 \neq = v, w \in R^{N} max \frac{v ^{T} A ^{- 1/2} E A ^{- 1/2} w}{∥ v ∥ ∥ w ∥} = 0 \neq = w_{h}, v_{h} \in V_{h} sup \frac{∣ a _{h} ( w _{h} , v _{h} ) - a ( w _{h} , v _{h} ) ∣}{a ( v _{h} , v _{h} ) a ( w _{h} , w _{h} )} .

λ_{*} \in Λ_{*} max λ \in Λ min \frac{∣ λ - λ _{*} ∣}{λ} \leq c 0 \neq = v_{h}, w_{h} \in V_{h} sup \frac{∣ a _{h} ( w _{h} , v _{h} ) - a ( w _{h} , v _{h} ) ∣}{∥ w _{h} ∥ _{H^{1} (Ω)} ∥ v _{h} ∥ _{H^{1} (Ω)}} .

λ_{*} \in Λ_{*} max λ \in Λ min \frac{∣ λ - λ _{*} ∣}{λ} \leq c 0 \neq = v_{h}, w_{h} \in V_{h} sup \frac{∣ a _{h} ( w _{h} , v _{h} ) - a ( w _{h} , v _{h} ) ∣}{∥ w _{h} ∥ _{H^{1} (Ω)} ∥ v _{h} ∥ _{H^{1} (Ω)}} .

h ⟶ 0^{+} lim λ_{*} \in Λ_{*} max λ \in Λ min ∣ λ - λ_{*} ∣ = 0.

h ⟶ 0^{+} lim λ_{*} \in Λ_{*} max λ \in Λ min ∣ λ - λ_{*} ∣ = 0.

λ_{*} \in Λ_{*} max λ \in Λ min ∣ λ - λ_{*} ∣ \leq P^{- 1/2} E P^{- 1/2} .

λ_{*} \in Λ_{*} max λ \in Λ min ∣ λ - λ_{*} ∣ \leq P^{- 1/2} E P^{- 1/2} .

P^{- 1/2} E P^{- 1/2} = 0 \neq = v, w \in R^{N} max \frac{v ^{T} P ^{- 1/2} E P ^{- 1/2} w}{∥ v ∥ ∥ w ∥} = 0 \neq = v, w \in R^{N} max \frac{v ^{T} E w}{v ^{T} P v w ^{T} P w} .

P^{- 1/2} E P^{- 1/2} = 0 \neq = v, w \in R^{N} max \frac{v ^{T} P ^{- 1/2} E P ^{- 1/2} w}{∥ v ∥ ∥ w ∥} = 0 \neq = v, w \in R^{N} max \frac{v ^{T} E w}{v ^{T} P v w ^{T} P w} .

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsAdvanced Numerical Analysis Techniques · Polynomial and algebraic computation · Numerical methods in engineering

Full text

A preconditioning strategy for linear systems arising

from

nonsymmetric schemes in isogeometric analysis

Mattia Tani

Università di Pavia, Dipartimento di Matematica “F. Casorati”, via A. Ferrata 5, 27100 Pavia, Italy.

[email protected]

Abstract

In the context of isogeometric analysis, we consider two discretization approaches that make the resulting stiffness matrix nonsymmetric even if the differential operator is self-adjoint. These are the collocation method and the recently-developed weighted quadrature for the Galerkin discretization. In this paper, we are interested in the solution of the linear systems arising from the discretization of the Poisson problem using one of these approaches. In [SIAM J. Sci. Comput. 38(6) (2016) pp. A3644–A3671], a well-established direct solver for linear systems with tensor structure was used as a preconditioner in the context of Galerkin isogeometric analysis, yielding promising results. In particular, this preconditioner is robust with respect to the mesh size $h$ and the spline degree $p$ . In the present work, we discuss how a similar approach can applied to the considered nonsymmetric linear systems. The efficiency of the proposed preconditioning strategy is assessed with numerical experiments on two-dimensional and three-dimensional problems.

keywords:

isogeometric analysis , preconditioning , collocation , weighted quadrature

MSC:

[2016] 65N30 , 65F08 , 65F30

1 Introduction

Isogeometric analysis (IGA) is a method to numerically solve PDEs, which was first proposed in [1] as an attempt to create a link between the Computer Aided Design (CAD) and finite elements analysis. In IGA, the functions that are used to approximate the solution of the considered PDE are B-splines and their generalizations (e.g. NURBS). Differently from the $C^{0}$ piecewise polynomials used in the finite element method, such functions can have a high regularity. In particular, in the so-called $k$ -method, which is the primary focus of this work, $C^{p-1}$ piecewise polynomials (or rational functions in case of NURBS) of degree $p$ are considered. One important implication of the higher regularity of the basis functions is that they yield a higher accuracy per degree-of-freedom when compared to standard finite elements. On the other hand, they also yield a higher computational cost, at least in currently available software.

The computational core of any numerical solver for a linear PDE is typically represented by a linear algebra phase, where a linear system is assembled and then solved. In the IGA Galerkin method, the first task is considered particularly challenging from the computational point of view. Indeed, the common approach to assemble the system matrix, inherited by finite element codes, is to perform a loop over the elements. On each element the local integrals are approximated e.g. by Gaussian quadrature and then added to the global matrix. The computational cost of the overall process, in case of maximum regularity splines of degree $p$ , is $O(Np^{3d})$ FLOPS, where $N$ is the total number of degrees of freedom and $d$ is the dimension of the physical domain. Clearly, this cost makes the use of high-order splines unfeasible for practical problems.

One possible strategy to work around the problem is to abandon the Galerkin approach in favor of a collocation method [2, 3, 4]. Here the approximate solution is found by enforcing that the differential equation is satisfied on a given finite set of points. One of the appealing feature of this approach is that no quadrature is needed, and the formation of the system matrix requires only the evaluation of the basis functions (and their derivatives) at the collocation points.

Within the Galerkin framework, in the recent years a significant number of papers has focused on strategies that mitigate the cost of numerical quadrature, see e.g. [5, 6, 7, 8, 9, 10, 11].

A very recent strategy was presented in [12], where the idea of weighted quadrature (WQ) is introduced. In this approach, the matrix system is assembled row by row, rather than element by element. For each matrix row, the test function is incorporated into the integral measure. In this way, a row-dependent quadrature rule is derived by imposing some exactness conditions. If this strategy is combined with sum-factorization to exploit the tensor structure of the basis functions, the final cost of matrix assembly is $O(Np^{d+1})$ .

A common feature of the mentioned approaches, collocation and WQ Galerkin, is that the resulting system matrix is in general nonsymmetric, even when the underlying differential operator is self-adjoint. This lack of symmetry might be considered an unfavorable property when it comes to solving the linear system. This is true in particular for iterative solvers, which are the focus of this work. As far as we know, the only preconditioners that have been employed in the literature to tackle IGA collocation systems are: a simple yet effective incomplete LU factorization [4, 13], an Overlapping Schwarz method [14], and a multi-iterative preconditioner based on the analysis of the spectral symbol of the system matrix [15].

The Fast Diagonalization (FD) method [16, 17] is a well-established direct solver for linear systems having a special tensor structure. Recently, it has been shown [18] that the FD method is a very efficient and robust preconditioner for symmetric systems arising from the isogeometric discretization of elliptic problems. This approach fully exploits the tensor structure that is peculiar to spline basis functions.

The aim of the present work is to discuss the application of a preconditioning strategy based on the same ideas in the nonsymmetric setting. In particular, we discuss the spectrum of the preconditioned and unpreconditioned stiffness matrix arising from numerical quadrature of the stiffness matrix. We prove that under a reasonable assumption on the quadrature approach, which is numerically shown to be valid in case of WQ, the spectrum of such approximated matrices converges to the spectrum of the exact matrices. This means in particular that the proposed preconditioner, which is robust for the exact Galerkin matrix, is also robust for the approximated one. Concerning the application of the preconditioner, we present and discuss two popular algorithms, the Bartels- Stewart (BS) method [19] and the nonsymmetric version of the FD method. The discussion reveals that the latter is more suited for the particular features of IGA problems. We report numerical experiments using the FD method as preconditioner, which show that the proposed approach has the same appealing features as in the symmetric setting. In particular, the number of iterations on the preconditioned system is independent of the mesh size $h$ and of the spline degree $p$ . Moreover, the FD method in the 3D case is so fast that the effort required for the application of the preconditioner is negligible with respect to that of the matrix-vector product which is the fundamental step of any iterative method.

The outline of the paper is as follows. In Section 2 we set the notation and introduce the isogeometric collocation method and the weighted quadrature approach. In Section 3 we define the preconditioner and discuss its robustness. In Section 4 we describe the BS and FD methods, and discuss their computational cost. Numerical experiments with the FD preconditioner are reported in Section 5. Finally, in Section 6 we draw the conclusions.

2 Preliminaries

2.1 The problem

As a model problem, we consider the following Poisson problem with Dirichlet boundary conditions:

[TABLE]

where $\Omega$ is a domain of $\mathbb{R}^{d}$ ( $d=2,3$ ), and $K(\mathbf{x})$ is a $d\times d$ symmetric positive definite matrix for each $\mathbf{x}\in\Omega$ .

Given two positive integers $p$ and $m$ , for each direction $l=1,\ldots,d$ we consider the open knot vector $\Xi_{l}={\color[rgb]{0,0,0}\left\{\xi_{1}^{(l)},\ldots,\xi^{(l)}_{m+p+1}\right\}}$ , with

[TABLE]

where repeated knots are allowed, up to multiplicity $p$ . We denote with $\hat{B}_{l,i}$ , $i=0,1,\ldots,m-1$ , the univariate spline basis functions of degree $p$ corresponding to the knot vector $\Xi_{l}$ , $l=1,\ldots,d$ (we refer to [20, Chapter 2] for the details). These are piecewise polynomials of degree $p$ which are $C^{p-r}$ continuous at nodes with multiplicity $r$ . While we are mainly interested in the case of maximum regularity $p-1$ (i.e. all knots have multiplicity 1 except the first and the last), this is not restrictive for the presented approach and any continuity is allowed.

Multivariate B-splines are defined by tensor product of univariate B-splines. Given a multi-index $\boldsymbol{i}=(i_{1},\ldots,i_{d})$ , we define

[TABLE]

where $\xi=\left(\xi_{1},\ldots,\xi_{d}\right)$ . To avoid proliferation of notation, we assume that all the univariate spline spaces have the same order and the same dimension. We emphasize, however, that this is not restrictive for the approaches described in the present work.

For simplicity, we restrict to the case when $\Omega$ is given by a single-patch spline parametrization, i.e. we assume that $\Omega=\textbf{F}(\left[0,1\right]^{d})$ , where F has the form

[TABLE]

We consider the function search space for problem (2.1):

[TABLE]

Note that the homogeneous Dirichlet boundary conditions are incorporated in the definition of $V_{h}$ . If we define $n:=m-2$ , the number of degrees of freedom is given by $N=\mbox{dim}(V_{h})=n^{d}$ . In the following, with abuse of notation we will identify the multi-index $\boldsymbol{i}=(i_{1},\ldots,i_{d})$ with the scalar index ${\color[rgb]{0,0,0}i=1+\sum_{l=1}^{d}n^{l-1}(i_{l}-1)}$ .

2.2 The collocation method

We consider the collocation points $\tau_{\boldsymbol{i}}=\left(\tau_{1,i_{1}},\ldots,\tau_{d,i_{d}}\right)$ where $\tau_{l,i_{l}}\in\left[0,1\right]$ , $i_{l}=1,\ldots,n$ , $l=1,\ldots,d$ are suitably-chosen scalar points. The most common choice is the Greville abscissae, i.e.

[TABLE]

but other choices are possible, see e.g. [21, 22, 2, 13, 23, 24]. If, for any $u_{h}\in V_{h}$ , we define

[TABLE]

then the approximate solution $u_{h}^{C}\in V_{h}$ is defined by the conditions

[TABLE]

If $\textbf{u}_{C}\in\mathbb{R}^{N}$ represents the vector of coefficients of $u_{h}^{C}$ with respect to the spline basis (2.2), then the conditions (2.5) can be written as a linear system, that is

[TABLE]

where ${\color[rgb]{0,0,0}\textbf{f}_{C}=\left[f(\textbf{F}(\tau_{1})),\ldots,f(\textbf{F}(\tau_{N}))\right]^{T}}$ . We emphasize that the matrix system $\mathcal{A}_{C}$ is not symmetric.

The expression of $\mathcal{A}_{C}$ in the general case is complicated, and we do not report it. However, it greatly simplifies when F is the identity function (thus in particular $\Omega=[0,1]^{d}$ ), and $K(\mathbf{x})\equiv I_{d}$ . For this special case we write an explicit expression for $\mathcal{A}_{C}$ , as it will be useful in the following.

For $d=2$ , we have

[TABLE]

where $\otimes$ denotes the Kronecker product [25, Section 1.3.6], and the Kronecker factors are defined as:

[TABLE]

For $d=3$ we have instead

[TABLE]

with similar definition of the Kronecker factors.

2.3 Weighted quadrature

We consider the Galerkin projection of problem (2.1) on the space $V_{h}$ . The exact stiffness matrix, denoted by $\mathcal{A}_{G}$ , has the form

[TABLE]

Here matrix $Q(\xi)$ is given by

[TABLE]

where $J_{\textbf{F}}$ is the Jacobian of F, and the scalar-valued functions ${\color[rgb]{0,0,0}c_{\alpha,\beta}}$ , $\alpha,\beta=1,\ldots,d$ are the entries of $Q$ .

In practice, the integrals appearing in the entries of $\mathcal{A}_{G}$ have to be approximated by numerical quadrature. Following the weighted quadrature approach [12], we incorporate the function $\partial_{\alpha}\hat{B}_{\boldsymbol{i}}$ in the quadrature weights to get

[TABLE]

for some suitably-chosen quadrature points $\boldsymbol{x}_{\boldsymbol{q}}$ and weights $w_{\boldsymbol{i}\boldsymbol{q}}^{\alpha\beta}$ . In [12] the weights are chosen by imposing that quadrature approximations (2.11) are exact when the ${\color[rgb]{0,0,0}c_{\alpha\beta}}$ are constant functions. Thus in this case we have $\mathcal{A}_{wq}=\mathcal{A}_{G}$ . We emphasize that, even if $\mathcal{A}_{G}$ is symmetric, the approximation $\mathcal{A}_{wq}$ is not, in general. We refer to [12] for more details.

The linear system obtained with the WQ approach has the form

[TABLE]

Here $\textbf{f}_{wq}{\color[rgb]{0,0,0}\in\mathbb{R}^{N}}$ represents a suitable approximation of the Galerkin right-hand side, which can be obtained either by using again the WQ approach, or by some different quadrature approach (e.g. Gaussian quadrature). Similarly as before, a particularly interesting case occurs when F is the identity mapping and $K(\mathbf{x})\equiv I_{d}$ . Note that in this case $Q(\xi)\equiv I_{d}$ and the quadrature formulas (2.11) are exact. Thus for $d=2$ we have

[TABLE]

where

[TABLE]

for $l=1,2$ , $i,j=1,\ldots,n$ . Note in particular that $M_{1}^{wq}$ , $M_{2}^{wq}$ , $K_{1}^{wq}$ , $K_{2}^{wq}$ are symmetric and positive definite. For $d=3$ we have instead

[TABLE]

with similar definition of the Kronecker factors.

3 The preconditioner

We are interested in the iterative solution of the “abstract” linear system

[TABLE]

which can represent either the collocation system (2.6) or the WQ Galerkin system (2.12).

Since $\mathcal{A}$ is nonsymmetric, we consider a Krylov method for nonsymmetric systems, such as GMRES [26] or BICGStab [27]. Of course, in order to have an efficient and robust method, we need a good preconditioner.

The quality of a preconditioner $\mathcal{P}$ depends on two features. The first one is that a Krylov method applied to the preconditioned system should converge fast; it is typically required that the spectrum of $\mathcal{P}^{-1}\mathcal{A}$ is clustered. Even though in the nonsymmetric case this does not guarantee fast convergence from a theoretical point of view , for practical problems a clustered spectrum is typically associated with fast convergence. We say that a preconditioner is robust if the eigenvalues of $\mathcal{P}^{-1}\mathcal{A}$ are bounded away from 0 and infinity by constants that do not depend on the parameters of the problem.

In the spirit of the $k-$ method, we are primarily interested in a preconditioner which is robust with respect to both the spline degree $p$ and the mesh size $h$ . The second feature of a good preconditioner is that the cost of computing the action of $\mathcal{P}^{-1}$ on vectors should be low, e.g. comparable with the cost of computing matrix-vector products with $\mathcal{A}$ .

In the spirit of [18], we choose as preconditioner for $\mathcal{A}$ the matrix $\mathcal{P}$ which represents the same differential problem (2.1), discretized with the same approach as $\mathcal{A}$ (collocation or WQ Galerkin), but where the domain is replaced with the unit $d-$ dimensional cube $[0,1]^{d}$ . We remark that a similar strategy was already considered for collocation in [15], where $\mathcal{P}$ is referred to as the Parametric Laplacian matrix; in that work, however, the preconditioner is applied inexactly using a multi-iterative approach.

We now specialize the notation for $\mathcal{P}$ in the different settings, similarly as we did for the stiffness matrix $\mathcal{A}$ in the previous section. In the context on collocation, the preconditioner is denoted by $\mathcal{P}_{C}$ and it is defined by either (2.7) for a two-dimensional (2D) problem or (2.9) for a three-dimensional (3D) problem. In the context on WQ Galerkin, the preconditioner is denoted by $\mathcal{P}_{wq}$ and it is expressed by (2.7) for a 2D problem or by (2.9) for a 3D problem. Finally, in the context of Galerkin discretization with exact integration, the preconditioner is denoted by $\mathcal{P}_{G}$ . Note that $\mathcal{P}_{wq}=\mathcal{P}_{G}$ .

In the symmetric setting, a relevant bound for the conditioning of the preconditioned system can be derived with little effort. Namely, in [18] it was observed that, if $\lambda$ denotes any eigenvalue of $\mathcal{P}^{-1}_{G}\mathcal{A}_{G}$ , then

[TABLE]

where $Q(\xi)$ is the matrix defined in (2.10), while $\lambda_{\min}\left(\cdot\right)$ and $\lambda_{\max}\left(\cdot\right)$ denote respectively the minimum and maximum eigenvalue of a symmetric matrix. Assuming that $0<\inf_{\Omega}\lambda_{\min}(Q)\leq\sup_{\Omega}\lambda_{\max}(Q)<+\infty$ , this bound guarantees that a Krylov method such as CG or GMRES converges in a number of iterations that is bounded by a constant independent of $h$ and $p$ .

In the nonsymmetric setting, the spectral analysis of $\mathcal{P}^{-1}\mathcal{A}$ is not as simple as in the symmetric case, but it is still possible to obtain some results. In the case of collocation, we mention the symbol-based analysis presented in [15], where the authors recognize that the symbol of the preconditioned matrix is bounded as in (3.2). In the WQ approach, it is possible to analyze the eigenvalues of $\mathcal{P}^{-1}_{wq}\mathcal{A}_{wq}$ by considering such matrix a “perturbation” of $\mathcal{P}^{-1}_{G}\mathcal{A}_{G}$ . This will be the topic of the next section.

3.1 On the eigenvalues of approximated Galerkin matrices

We consider the general setting of a symmetric and uniformly coercive bilinear form $a(\cdot,\cdot)$ defined on the space $V_{h}$ . Note that the bilinear form associated with the weak formulation of problem (2.1) is

[TABLE]

We consider another bilinear form, denoted by $a_{h}(\cdot,\cdot)$ , which represents an approximation of $a(\cdot,\cdot)$ , stemming e.g. from numerical quadrature. Our results are based on the assumption that $a_{h}(\cdot,\cdot)$ satisfies the following property:

[TABLE]

Our first result refers to the eigenvalues of the approximated stiffness matrix.

Proposition 1

Let $a:V_{h}\times V_{h}\longrightarrow\mathbb{R}$ be a symmetric and uniformly coercive bilinear form, and let $a_{h}:V_{h}\times V_{h}\longrightarrow\mathbb{R}$ that satisfies property (3.4). Let $A$ and $A_{*}$ be $N\times N$ matrices associated to $a(\cdot,\cdot)$ and $a_{h}(\cdot,\cdot)$ , respectively, with respect to any basis of $V_{h}$ . If $\Lambda$ and $\Lambda_{*}$ denote respectively the spectra of $A$ and $A_{*}$ , then we have

[TABLE]

Proof 1

Let $E:=A-A_{*}$ . It holds that (see [28, Theorem 2.3])

[TABLE]

where, here and throughout, $\left\|\cdot\right\|$ denotes the (matrix or vector) 2-norm. It holds that

[TABLE]

Since $a(\cdot,\cdot)$ is uniformly coercive, there exits a constant $c>0$ independent of $h$ such that

[TABLE]

The sought-after result then immediately follows from (3.4).

The second result, which is proven using similar arguments, addresses the eigenvalues of the preconditioned matrix.

Proposition 2

Let $a(\cdot,\cdot)$ , $a_{h}(\cdot,\cdot)$ , $A$ and $A_{*}$ be as in the statement of Proposition 1. Moreover, let $P\in\mathbb{R}^{N\times N}$ be symmetric and positive definite, and assume that the eigenvalues of $P^{-1}A$ are bounded away from [math] independently of $h$ . If $\Lambda$ and $\Lambda_{*}$ denote respectively the spectra of $P^{-1}A$ and $P^{-1}A_{*}$ , then we have

[TABLE]

Proof 2

We prove that the eigenvalues of $P^{-1/2}AP^{-1/2}$ converge to those of $P^{-1/2}A_{*}P^{-1/2}$ . Then (3.5) immediately follows from a similarity argument. As before, if we let $E:=A-A_{*}$ , it holds that (see [29, Theorem III])

[TABLE]

We have

[TABLE]

The boundedness of the eigenvalues of $P^{-1}A$ and the uniform coercivity of $a(\cdot,\cdot)$ guarantee that there exists a constant $c>0$ independent of $h$ such that for any $v=[v_{1},\ldots,v_{N}]^{T}\in\mathbb{R}^{N}$ it holds that

[TABLE]

Thus, we have

[TABLE]

The sought-after result then immediately follows from (3.4).

Proposition 2 states in particular that if $P$ is a robust preconditioner for the exact Galerkin matrix, then $P$ is still a robust preconditioner for the approximated matrix. Thus, if (3.4) is satisfied by the WQ approach, then when solving iteratively the preconditioned system $\mathcal{P}_{wq}^{-1}\mathcal{A}_{wq}$ we can expect the same convergence behavior that is observed with the exact Galerkin matrix.

However, proving the validity of (3.4) in the case of weighted quadrature is beyond the scope of this work. Indeed, this property is linked to the optimal order of convergence of the WQ approach, and this will be the topic of a forthcoming paper. Instead, we give numerical evidence that (3.4) holds in our setting.

Let $H\in\mathbb{R}^{N\times N}$ be the symmetric and positive definite matrix that represents the standard $H^{1}$ -norm on $V_{h}$ in coordinates with respect to the chosen spline basis, i.e. for any $v=(v_{1},\ldots,v_{N})\in\mathbb{R}^{N}$ it holds

[TABLE]

In Table 1 we report the number

[TABLE]

where as before $\left\|\cdot\right\|$ denotes the 2-norm of a matrix, for different values of $h$ and $p$ in the case of the quarter of ring domain (see Section 5) and with $K(\mathbf{x})\equiv I_{d}$ . It is clear from these values that $e_{h}=O(h)$ for all $p$ . A similar behavior is observed using different domains.

3.2 Preconditioning for inherently nonsymmetric problems

Although the focus of this work is on self-adjoint problems which become nonsymmetric on the algebraic level due to the effects of discretization, we now briefly describe how the preconditioning approach discussed in this section can be adapted to inherently nonsymmetric problems, such as convection-diffusion problems. This case is not further explored in this paper.

Consider the nonsymmetric variant of problem (2.1) in which the differential operator is replaced with

[TABLE]

Here $\textbf{w}=(w_{1},\ldots,w_{d}):\Omega\longrightarrow\mathbb{R}^{d}$ represents the vector-valued “wind”.

Applying the ideas presented in this paper to build a preconditioner for this problem requires the approximation of $\mathcal{L}_{\textbf{w}}$ with a simpler operator. We emphasize that this approach has already been considered in the context of finite differences; we refer the interested reader to [30, 31, 32] and references therein.

Consider an isogeometric Galerkin discretization of (3.6), and assume for simplicity that the integrals are approximated with standard Gaussian quadrature. One approach is to build the preconditioner from $\mathcal{L}_{\textbf{w}}$ by simply discarding the convective term; in other words, $\mathcal{L}_{\textbf{w}}$ is approximated with the (negative) Laplacian. Discretization on the reference domain yields

[TABLE]

respectively for $d=2$ and $d=3$ , where

[TABLE]

for $i,j=1,\ldots,n$ , $l=1,\ldots,d$ . Then all the Kronecker factors of $\mathcal{P}$ are symmetric and positive definite, and the action of $\mathcal{P}^{-1}$ on a vector can be computed using the symmetric FD method as described in [18]. To improve the effectiveness of this strategy one can include partial information on the wind in the preconditioner. For example, consider a univariate approximation for the components of $\textbf{w}\circ\textbf{F}$ , namely

[TABLE]

and define the approximated operator $\mathcal{L}_{\widetilde{\textbf{w}}}=-\Delta+\widetilde{\textbf{w}}\cdot\nabla$ , with $\widetilde{\textbf{w}}=(\tilde{w}_{1},\ldots,\tilde{w}_{d})$ . By discretizing this operator on the reference domain, we obtain the preconditioner

[TABLE]

respectively for $d=2$ and $d=3$ , where $K_{l}$ , $M_{l}$ , $l=1,\ldots,d$ , are given by (3.8) and

[TABLE]

Note that these integrals cannot in general be computed exactly, due to the presence of $\tilde{w}_{l}$ , however for the sake of simplicity we ignore the (small) quadrature error. We emphasize that (3.9) is identical to (3.7), with the only difference that the factors $K_{l}$ are replaced with the nonsymmetric matrices $K_{l}+H_{l}$ . It follows that the algorithms described in the next section can be used to apply $\mathcal{P}^{-1}$ on a vector.

4 The BS and FD methods for nonsymmetric systems

We now turn on the fundamental problem of efficiently applying the preconditioner. At each iteration of a Krylov method the preconditioning step requires the solution of a linear system of the form

[TABLE]

with $\textbf{b}\in\mathbb{R}^{N}$ . We first consider the 2D case; system (4.1) takes the form

[TABLE]

where the Kronecker factors can represent either the collocation matrices (2.8), or the WQ Galerkin matrices (2.14).

In [18], a linear system analogous to (4.2) is considered, where the Kronecker factors $M_{1}$ , $M_{2}$ , $K_{1}$ and $K_{2}$ are symmetric and positive definite. Two well established methods, one iterative and one direct are discussed and compared. The results show that the direct solver, namely the FD method, is more suited than the other for solving IGA problems, in terms of plain efficiency. In fact, in the numerical experiments of [18] it is shown that the application of the FD preconditioner is even faster than the matrix-vector product with $\mathcal{A}$ , which is an unavoidable step at each iteration of a Krylov method.

Recall that in the WQ approach the Kronecker factors of (4.2) are symmetric and positive definite. Thus the FD method can be applied to (4.2) exactly as described in [18]. On the other hand, in the case of collocation the Kronecker factors are no longer symmetric. In the rest of this section we discuss how the FD method can be extended to the nonsymmetric case.

A popular way to solve a Sylvester equation like (4.2) is the BS method [19]. Since in our problem the matrices $M_{1}$ and $M_{2}$ are well-conditioned, we first simplify system (4.2) by premultiplying it by $\left(M_{2}\otimes M_{1}\right)^{-1}$ to get

[TABLE]

Assume for a moment that the eigenvalues of $M_{1}^{-1}K_{1}$ are real. Then there exist an orthogonal matrix $Q_{1}$ and an upper triangular matrix $R_{1}$ such that

[TABLE]

If $M_{1}^{-1}K_{1}$ has complex eigenvalues, then the above factorization is still possible in real arithmetic, with the difference that $R_{1}$ is block upper triangular with $1\times 1$ and $2\times 2$ diagonal blocks. The factorization (4.4) is called Schur decomposition. If we consider an analogous factorization for the pencil $(K_{2},M_{2})$ , then we can factorize the whole (4.3) as

[TABLE]

By inverting this equation and introducing the matrices $G_{l}:=M_{l}^{-T}Q_{l}$ , $l=1,2$ , we obtain:

[TABLE]

Thus, the solution s can be computed in the following four steps:

Compute the Schur decompositions $Q_{l}^{T}M_{l}^{-1}K_{l}Q_{l}=R_{l}$ and matrices $G_{l}=M_{l}^{-T}Q_{l}$ , $l=1,2$ . 2. 2.

Compute the matrix-vector product $\tilde{\textbf{b}}={\color[rgb]{0,0,0}\left(G_{2}\otimes G_{1}\right)^{T}}\textbf{b}$ . 3. 3.

Solve the system $\left(R_{2}\otimes I_{n}+I_{n}\otimes R_{1}\right)\tilde{\textbf{s}}=\tilde{\textbf{b}}$ . 4. 4.

Compute the matrix-vector product $\textbf{s}=\left(Q_{2}\otimes Q_{1}\right)\tilde{\textbf{s}}$ .

We emphasize that the matrix-vector products involving $Q_{2}\otimes Q_{1}$ and $\left(G_{2}\otimes G_{1}\right)^{T}$ can be computed efficiently by exploiting the properties of the Kronecker product. The system involving $(R_{2}\otimes I_{n}+I_{n}\otimes R_{1})$ can be solved by block backsubstitution; see the next section for details.

If the matrices $M^{-1}_{l}K_{l}$ , $l=1,\ldots,d$ , are diagonalizable (see Remark 1), another option is available. In this case we can apply to (4.2) the nonsymmetric version of the FD method. We mention that his approach has been pursued also in the context of spectral methods, see e.g. [33, Chapter 4] and references therein.

Consider the eigendecomposition

[TABLE]

where the columns of $U_{l}$ form a basis of eigenvectors for $M^{-1}_{l}K_{l}$ and $D_{l}$ is the diagonal matrix whose entries are the eigenvalues of $M^{-1}_{l}K_{l}$ .

We now define $V_{l}:=(M_{l}U_{l})^{-T}$ . Note that in the symmetric case, that is when $K_{l}$ is symmetric and $M_{l}$ is symmetric and positive definite, we can assume that the columns of $U_{l}$ are $M_{l}$ -orthonormal, which implies $V_{l}=U_{l}$ ; in this way we recover the symmetric FD method presented in [18]. It holds that

[TABLE]

which we can use to factorize (4.2) as

[TABLE]

By inverting this equation, we obtain:

[TABLE]

which suggests that the solution s can be computed in four steps which are analogous to the ones for the BS method. Note, however, that in this case the solution of the system at step 3 is just a diagonal scaling.

We remark that if $M^{-1}_{l}K_{l}$ has complex eigenvalues, then the eigendecomposition (4.6) can still be performed in real arithmetic by allowing $D_{l}$ to be block diagonal with $1\times 1$ and $2\times 2$ diagonal blocks.

We now turn on the 3D case, which is more interesting since 3D problems are considered more challenging from the computational point of view. In this case system (4.1) takes the form

[TABLE]

The BS and FD methods can be easily generalized to this case. Indeed, if we premultiply the above equation for $(M_{3}\otimes M_{2}\otimes M_{1})^{-1}$ and consider the factorization (4.4) for the matrix pencils $(K_{l},M_{l})$ , $l=1,2,3$ , then we obtain an equivalent of (4.5), i.e.

[TABLE]

Similarly, if the factorization (4.6) exists for $l=1,2,3$ , then we obtain an equivalent of (4.7)

[TABLE]

Remark 1

In the setting of collocation and WQ Galerkin, we have no guarantee that matrix $M^{-1}_{l}K_{l}$ is diagonalizable, so in principle the matrix $U_{l}$ appearing in (4.6) could be singular or very ill-conditioned. We remark in particular that the matrices $M^{-1}_{l}K_{l}$ have an “outlier” eigenvalue with algebraic multiplicity 2. However, in all our numerical tests for collocation, performed by choosing (2.4) as collocation points and using uniform knot vectors, we observed that $\left(M^{C}_{l}\right)^{-1}K^{C}_{l}$ is diagonalizable with well-conditioned eigenvector matrix. Moreover, its eigenvalues are all real. The reality of the eigenvalues has been proved for the case of cubic Hermite ( $C^{1}$ -continuous) splines in [34], for any choice of the collocation points. To the best of our knowledge, no proof for more general cases exists. As for WQ Galerkin, we similarly observed in all our numerical tests that ( $M^{wq}_{l})^{-1}K^{wq}_{l}$ is diagonalizable (with well-conditioned eigevector matrix) and has real eigenvalues. In view of these considerations, in the following we will always assume that the matrices $M^{-1}_{l}K_{l}$ are diagonalizable with real eigenvalues.

4.1 Computational cost

We now discuss the computational cost of the approaches described in the previous section. Since the solution of system (4.1) is a preconditioning step, we distinguish between the cost for the setup of the preconditioner, which has to be paid only once, and the cost for its application, which has to be paid at each iteration of the chosen iterative method.

We start by considering the 2D case. In the BS method, the setup of the preconditioner requires the computation of the Schur decompositions (4.4); the cost of each of them can be estimated roughly as $25n^{3}$ FLOPS [25, Section 7.5.6]. In the setup of the FD method, we instead need to compute the eigenvector matrix $U_{l}$ . The crucial step of this computation (see [25, Section 7.6]) is again the Schur decomposition, which has the same cost as above.

Computation of each matrix $G_{l}=M_{l}^{-T}Q_{l}$ , $l=1,2$ , requires the solution of $n$ linear systems involving the same system matrix. Since this matrix is banded with $O(p)$ bandwidth, the total cost of this operation is $O(n^{2}p)=O(Np)$ FLOPS, which is negligible. Similarly, the cost of computing $V_{l}=(M_{l}U_{l})^{-T}$ , $l=1,2$ , is negligible.

In both methods, the application of the preconditioner requires two matrix-vector products involving matrices in Kronecker form. By exploiting the properties of the Kronecker product (see [25, Section 1.3.6] for details), these operations can be reformulated in terms of a few matrix-matrix products, for a total cost of $8n^{3}$ FLOPS. We emphasize that a matrix-matrix product is a level 3 BLAS operation and typically yields high efficiency on modern computers, see [25, Chapter 1]. Application of the FD preconditioner also requires a diagonal scaling, which is inexpensive. In the BS approach, we instead need to solve a system of the form

[TABLE]

with $\textbf{y},\textbf{z}\in\mathbb{R}^{N}$ . We now give details on how this system is solved. If $\textbf{y}=\left(y_{1},\ldots,y_{N}\right)^{T}$ and $\textbf{z}=\left(z_{1},\ldots,z_{N}\right)^{T}$ , let $\textbf{y}_{k}=\left(y_{(k-1)n+1},\ldots,y_{kn}\right)^{T}$ , $\textbf{z}_{k}=\left(z_{(k-1)n+1},\ldots,z_{kn}\right)^{T}$ denote the $k$ -th block of $n$ components of y and z, for $k=1,\ldots,n$ . Then the last block of $n$ equations of (4.10) reads:

[TABLE]

which is an upper triangular system and can be solved by backsubstitution.

Similarly, once $\textbf{y}_{k+1},\ldots,\textbf{y}_{n}$ are known, $\textbf{y}_{k}$ can be recovered from the $k$ -th block of equations

[TABLE]

The total cost of this process is roughly $2n^{3}$ FLOPS.

We now turn to the 3D case. The setup costs of the two methods are similar to the 2D case, that is $O(n^{3})=O(N)$ FLOPS. We remark that, while for $d=2$ this was likely the main computational effort of the methods, for $d=3$ this cost is proportional to the number of degrees of freedom, so it is negligible. This fact makes the presented approaches more appealing in three dimensions than in two.

As for the application costs, computation of the matrix-vector products in and requires $12n^{4}$ FLOPS in both approaches. As in the 2D case, these products can be written in terms of highly efficient BLAS-3 operations.

This is roughly the total effort for the FD method, while in the BS method we have to add the cost of solving the system

[TABLE]

with $\textbf{y},\textbf{z}\in\mathbb{R}^{N}$ . This can done similarly as in previous case. Indeed, for $k=n,n-1,\ldots,1$ we consider the $k$ -th block of $n^{2}$ equations:

[TABLE]

with $\textbf{y}_{k}$ and $\textbf{z}_{k}$ properly defined. If the right-hand side is known, this system has the form (4.10), and can be solved as previously described. The total cost of this process is roughly $3n^{4}$ FLOPS.

We emphasize that all the relevant costs, in both 2D and 3D cases, are independent of $p$ . This makes the considered approaches very appealing in the context of $k$ -refinement.

We have observed that the BS method has a higher computational cost when compared to the FD method. In view of Remark 1, the BS method does not seem to offer any advantage that can motivate its employment in our setting. Therefore, only the FD method is considered in the rest of the paper. As a side note we also point out that, while the matrix-matrix products which are the computational core of the FD preconditioner are naturally parallelizable operations, the block backsubstitution process needed to solve (4.10) and (4.11) is not. This means that the FD method is more suited for parallelization than the BS method.

5 Numerical tests

In this section, we show numerically the potential of the FD method described in the previous section.

All the numerical experiments are performed in Matlab Version 8.5.0.197613 (R2015a), with the toolbox GeoPDEs [35], on a Intel Xeon i7-5820K processor, running at 3.30 GHz, and with 64 GB of RAM. Since we do not explore parallelization in this work, only one core is used for the experiments.

All problems were solved with the BiCGStab method [27] preconditioned with $\mathcal{P}$ (implemented by the FD method). We recall that each iteration of BiCGStab consists in a Biconjugate Gradient (BiCG) step followed by a single iteration of GMRES. The generalized eigendecompositions (4.6) are computed with the Matlab function eig. We remark that, even though in many cases the matrix pencils $(K_{l},M_{l})$ are the same for every $l=1,\ldots,d$ , we still compute the eigendecomposition for each $l$ , so that the computational effort reflects the more general case where these pencils are different. In the 3D version of the algorithm, the products involving Kronecker matrices are performed using the function from the free Matlab toolbox Tensorlab 3.0 [36].

For our experiments, we consider the problem (2.1) on one 2D domain and one 3D domain. These are a quarter of ring and the solid obtained from it by performing a $\pi/2$ revolution around the axis having direction $(0,1,0)^{T}$ and passing through $(-1,-1,-1)^{T}$ . Both domains are shown in Figure 1. In all problems, we set $K(\mathbf{x})\equiv I_{d}$ .

As a benchmark, we compare the performance of the FD preconditioner with that of the ILU(0) preconditioner, i.e. an Incomplete LU factorization with zero fill-in. We remark that this preconditioner was already considered in the context of isogeometric collocation in [4, 13]. We also remark that incomplete factorizations make good preconditioners for IGA problems obtained with $k-$ refinement. Indeed, while it is known that such approaches are not robust with respect to $h$ , they seem to be robust with respect to $p$ . This was recognized numerically in [37, 18], in the context of the Galerkin method. To improve the effectiveness of ILU(0), we use the reverse Cuthill-McKee algorithm [38] (implemented by MATLAB function symrcm) to preliminary reorder the entries of system matrix, as we numerically verified that this is beneficial for the preconditioning strategy.

In Tables 2 and 4 we report the results, in terms of BiCGStab iterations and computation times, for the collocation approach. The results relative to the WQ Galerkin discretization are shown in Tables 3 and 5. When convergence occurs halfway through iteration $k$ (i.e. after the $k-$ th BiCG step but before the $k-$ th GMRES step), we report $k+0.5$ as iteration number. We remark that computation times also include the time spent for the setup of the preconditioner. The symbol “*” denotes the cases in which assembling the system matrix $\mathcal{A}$ was unfeasible due to memory limitations.

The conclusions we can draw from the numerical results are similar for collocation and WQ Galerkin, and they mirror those made in [18] for the symmetric FD preconditioner. First, we observe that the number of FD-preconditioned BiCGStab iterations is almost constant with respect to $p$ and $h$ . Second, the computation times of this approach scales very well in $h$ . As a consequence, in all the considered cases the FD preconditioner outperforms ILU(0) for small enough $h$ . Third, while in the 2D case the computation time of the FD-preconditioned iteration increases only mildly with $p$ , in the 3D case it increases significantly. This is due to the $O(Np^{3})$ cost of the BiCGStab matrix-vector products, which dominates in this case. In other words, the FD method is so efficient that its computational effort is negligible with respect to the overall iterative procedure.

6 Conclusions

In this paper, we have addressed the problem of iteratively solving large linear systems that arise from the isogeometric discretization of the Poisson problem. More specifically, we have focused on the case when the system matrix $\mathcal{A}$ is obtained with a nonsymmetric discretization scheme, such as the collocation method or the weighted quadrature approximation of the Galerkin matrix.

We considered as preconditioner the matrix representing the same operator as $\mathcal{A}$ , but with no geometry and coefficients, discussed its robustness in the context of approximated Galerkin matrices (which is the case for WQ), and showed how this preconditioner can be efficiently applied using the FD method. The numerical experiments confirm that, as in the symmetric case explored in [18], this preconditioning strategy is robust with respect to $h$ and $p$ and so efficient that the time spent in the application of the preconditioner is negligible with respect to that spent in the matrix-vector products of the chosen iterative solver. We also remark that this strategy can be combined with the domain decomposition approach described in [18] to solve problems on multi-patch domains.

A current research direction is the theoretical study of the weighted quadrature, which as a by-product would lead to proving property (3.4). Another research direction is devoted to making the preconditioner $\mathcal{P}$ more robust with respect to the geometry and the coefficients. Indeed, as it is can be deduced from bound (3.2), the current approach likely deteriorates in the presence of an ill-conditioned (or singular) matrix $Q(\xi)$ . Finally, we are devising a matrix-free version of the WQ approach. Beside further reducing the time spent in the setup of the linear system, this approach could also be beneficial during the iterative solution process. Indeed, if the matrix-vector products could be computed faster, this would lead to a further improvement, in terms of efficiency, of the overall iterative process.

Acknowledgments

The author would like to thank Giancarlo Sangalli for fruitful discussions on the topics of the paper. The author was supported by the European Research Council through the FP7 Ideas Consolodator Grant HIGEOM n.616563. This support is gratefully acknowledged.

Bibliography38

The reference list from the paper itself. Each links out to its DOI / PubMed record.

1[1] T. J. R. Hughes, J. A. Cottrell, Y. Bazilevs, Isogeometric analysis: CAD, finite elements, NURBS, exact geometry and mesh refinement, Comput. Methods Appl. Mech. Engrg. 194 (39) (2005) 4135–4195.
2[2] F. Auricchio, L. B. Da Veiga, T. Hughes, A. Reali, G. Sangalli, Isogeometric collocation methods, Math. Models Methods Appl. Sci. 20 (11) (2010) 2075–2107.
3[3] F. Auricchio, L. B. Da Veiga, T. Hughes, A. Reali, G. Sangalli, Isogeometric collocation for elastostatics and explicit dynamics, Comput. Methods Appl. Mech. Engrg. 249 (2012) 2–14.
4[4] D. Schillinger, J. A. Evans, A. Reali, M. A. Scott, T. J. Hughes, Isogeometric collocation: Cost comparison with Galerkin methods and extension to adaptive hierarchical NURBS discretizations, Comput. Methods Appl. Mech. Engrg. 267 (2013) 170–232.
5[5] T. J. Hughes, A. Reali, G. Sangalli, Efficient quadrature for NURBS-based isogeometric analysis, Comput. Methods Appl. Mech. Engrg. 199 (5) (2010) 301–313.
6[6] F. Auricchio, F. Calabrò, T. Hughes, A. Reali, G. Sangalli, A simple algorithm for obtaining nearly optimal quadrature rules for NURBS-based isogeometric analysis, Comput. Methods Appl. Mech. Engrg. 249 (2012) 15–27.
7[7] D. Schillinger, S. J. Hossain, T. J. Hughes, Reduced Bézier element quadrature rules for quadratic and cubic splines in isogeometric analysis, Comput. Methods Appl. Mech. Engrg. 277 (2014) 1–45.
8[8] A. Mantzaflaris, B. Jüttler, B. N. Khoromskij, U. Langer, Matrix generation in isogeometric analysis by low rank tensor approximation, in: International Conference on Curves and Surfaces, Springer, 2014, pp. 321–340.