The surrogate matrix methodology: Low-cost assembly for isogeometric analysis

Daniel Drzisga; Brendan Keith; Barbara Wohlmuth

arXiv:1904.06971·math.NA·August 11, 2020


TL;DR

This paper introduces a low-cost surrogate matrix methodology for isogeometric analysis that significantly reduces assembly time by performing limited quadrature and interpolating remaining matrix entries, with negligible impact on accuracy.

Contribution

The paper presents a novel surrogate matrix approach that reduces computational cost in IGA by combining selective quadrature with B-spline interpolation, easily integrable into existing software.

Findings

01 Over fifty-fold reduction in assembly time.

02 Negligible impact on solution accuracy.

03 Applicable to various PDE problems.



Equations

$\begin{bmatrix}\widetilde{\mathsf{A}}_{II}&\widetilde{\mathsf{A}}_{ID}\\ \widetilde{\mathsf{A}}_{DI}&\widetilde{\mathsf{A}}_{DD}\end{bmatrix}\begin{bmatrix}\widetilde{\mathsf{u}}_{I}\\ \widetilde{\mathsf{u}}_{D}\end{bmatrix}=\begin{bmatrix}\mathsf{f}_{I}\\ \mathsf{f}_{D}\end{bmatrix}.$

$a(v_{h},w_{h})-\widetilde{a}(v_{h},w_{h})$

$\leq|\mathsf{A}-\widetilde{\mathsf{A}}|_{\max}\sum_{i}\sum_{j\in\mathcal{I}(i)}|\mathsf{v}_{i}-\mathsf{v}_{j}|\,|\mathsf{w}_{i}-\mathsf{w}_{j}|$

$\leq|\mathsf{A}-\widetilde{\mathsf{A}}|_{\max}\sum_{i}\Big(\sum_{j\in\mathcal{I}(i)}|\mathsf{v}_{i}-\mathsf{v}_{j}|^{2}\Big)^{1/2}\Big(\sum_{j\in\mathcal{I}(i)}|\mathsf{w}_{i}-\mathsf{w}_{j}|^{2}\Big)^{1/2}.$

$|\mathsf{v}|_{N_{i}}=\|\nabla v_{h}\|_{0,\operatorname{supp}(N_{i})}.$

$Q_{i}=\{\mathsf{v}\in\mathbb{R}^{N}\colon\mathsf{v}_{j}=\mathsf{v}_{i}\text{ for each }j\in\mathcal{I}(i)\}.$

$|a(v_{h},w_{h})-\widetilde{a}(v_{h},w_{h})|$

Find $u\in V$ satisfying

Find $u_{h}\in V_{h}$ satisfying

$N_{i}(\bm{x})=\frac{w_{i}B_{i}(\bm{x})}{\sum_{j}w_{j}B_{j}(\bm{x})},\quad\bm{x}=(x_{1},\ldots,x_{n})\in\Omega,$

$\Xi^{(p)}=\{\underbrace{0,\cdots,0}_{p\text{ times}}\}\cup\Big\{\frac{k}{19}\Big\}_{k=0}^{19}\cup\{\underbrace{1,\cdots,1}_{p\text{ times}}\}.$

$\mathsf{A}\mathsf{u}=\mathsf{f}$ and $\widetilde{\mathsf{A}}\widetilde{\mathsf{u}}=\mathsf{f},$

$a(\phi_{j},\phi_{i})=a(\phi(\cdot-\bm{x}_{j}),\phi(\cdot-\bm{x}_{i})):=\Phi(\bm{x}_{j},\bm{x}_{i}).$

$\mathsf{A}_{ij}=\Phi_{\bm{\delta}}(\bm{x}_{i})$

$\mathsf{A}_{ij}=\Phi_{\bm{\delta}}(\bm{x}_{i}).$

$a(w,v)=\int_{\Omega}\nabla w(\bm{x})^{\top}K(\bm{x})\nabla v(\bm{x})\,\mathrm{d}\bm{x},\quad\text{where }K=\frac{D\bm{\varphi}^{-1}D\bm{\varphi}^{-\top}}{|\det(D\bm{\varphi}^{-1})|},$

$\mathsf{A}_{ij}=\int_{\Omega}\nabla B(\bm{x}-\bm{x}_{i})^{\top}K(\bm{x})\nabla B(\bm{x}-\bm{x}_{j})\,\mathrm{d}\bm{x}$

$\Phi_{\bm{\delta}}(\bm{x})=\int_{\omega_{\bm{\delta}}}\nabla B(\bm{y})^{\top}K(\bm{x}+\bm{y})\nabla B_{\bm{\delta}}(\bm{y})\,\mathrm{d}\bm{y}.$

$\mathsf{A}_{ij}=a(N_{j},N_{i})$

$\Phi_{\bm{\delta}}(\widetilde{\bm{x}})=w(\widetilde{\bm{x}})\,w(\widetilde{\bm{x}}+\bm{\delta})\int_{\widehat{\omega}_{\bm{\delta}}}\widehat{\nabla}\bigg(\frac{\widehat{B}(\widehat{\bm{y}})}{W(\widetilde{\bm{x}}+\widehat{\bm{y}})}\bigg)^{\top}K(\widetilde{\bm{x}}+\widehat{\bm{y}})\,\widehat{\nabla}\bigg(\frac{\widehat{B}_{\bm{\delta}}(\widehat{\bm{y}})}{W(\widetilde{\bm{x}}+\widehat{\bm{y}})}\bigg)\,\mathrm{d}\widehat{\bm{y}},$

$\Phi_{\bm{\delta}}(\bm{x}_{i})=\Phi(\bm{x}_{i}+\bm{\delta},\bm{x}_{i})=\Phi(\bm{x}_{j}-\bm{\delta},\bm{x}_{j})=\Phi_{-\bm{\delta}}(\bm{x}_{j}).$

$\widetilde{\Omega}=\bigg[\frac{3p+1}{2(m-p)},1-\frac{3p+1}{2(m-p)}\bigg]^{n}\subsetneq\widehat{\Omega}.$

$\|f-\Pi_{H}f\|_{L^{\infty}(\Omega)}\leq C_{1}H^{q+1}[f]_{W^{q+1,\infty}(\Omega)},\quad\text{for all }f\in W^{q+1,\infty}(\Omega),$

$\big\|\Phi_{\bm{\delta}}-\widetilde{\Phi}_{\bm{\delta}}\big\|_{L^{\infty}(\widetilde{\Omega})}\leq C_{2}\,h^{n-2}H^{q+1}\quad\text{for each }\bm{\delta}\in\mathscr{D}.$

$D^{\bm{\alpha}}\Phi_{\bm{\delta}}(\bm{x})=\int_{\omega_{\bm{\delta}}}\nabla B(\bm{y})^{\top}D^{\bm{\alpha}}K(\bm{x}+\bm{y})\nabla B_{\bm{\delta}}(\bm{y})\,\mathrm{d}\bm{y}.$

$\widetilde{\mathsf{A}}_{ij}=\begin{cases}\widetilde{\Phi}_{\bm{\delta}}(\bm{x}_{i})&\text{if }\bm{x}_{i},\bm{x}_{j}\in\widetilde{\Omega},\\ \mathsf{A}_{ij}&\text{otherwise.}\end{cases}$

$\widetilde{\mathsf{A}}_{ij}=\begin{cases}\widetilde{\Phi}_{\bm{\delta}}(\bm{x}_{i})&\text{if }\bm{x}_{i},\bm{x}_{j}\in\widetilde{\Omega}\text{ and }i\leq j,\\ \widetilde{\mathsf{A}}_{ji}&\text{if }\bm{x}_{i},\bm{x}_{j}\in\widetilde{\Omega}\text{ and }i>j,\\ \mathsf{A}_{ij}&\text{otherwise.}\end{cases}$

$\widetilde{\mathsf{A}}_{ij}=\begin{cases}\widetilde{\Phi}_{\bm{\delta}}(\bm{x}_{i})&\text{if }\bm{x}_{i},\bm{x}_{j}\in\widetilde{\Omega}\text{ and }i<j,\\ \widetilde{\mathsf{A}}_{ji}&\text{if }\bm{x}_{i},\bm{x}_{j}\in\widetilde{\Omega}\text{ and }i>j,\\ \mathsf{A}_{ij}&\text{in all other cases where }i\neq j,\\ -\sum_{k\neq i}\widetilde{\mathsf{A}}_{ik}&\text{if }i=j,\end{cases}$

$a(w,v)=\int_{\Omega}G(\bm{x},w(\bm{x}),v(\bm{x}))\,\mathrm{d}\bm{x}\quad\text{for all }w,v\in V,$

$G(\bm{x},w(\bm{y}),v(\bm{y}))=0,\quad\text{whenever }\bm{y}\notin\operatorname{supp}(w)\cap\operatorname{supp}(v).$

$\mathsf{A}_{ij}=a(B_{j},B_{i})$

$\Phi_{\bm{\delta}}(\bm{x})=\int_{\omega_{\bm{\delta}}}G(\bm{x}+\bm{y},B_{\bm{\delta}}(\bm{y}),B(\bm{y}))\,\mathrm{d}\bm{y}.$
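The last piecewise definition of the matrix entries above sets each diagonal entry to minus the sum of the off-diagonal entries in its row, so that constant vectors stay in the kernel. The following is a minimal sketch of that rule for a symmetric tridiagonal matrix. The function name `kernel_preserving_tridiagonal` and the dense list-of-lists storage are assumptions made for illustration only; they are not taken from the paper or from GeoPDEs.

```python
def kernel_preserving_tridiagonal(offdiag):
    """Build a dense symmetric tridiagonal matrix with zero row sums.

    offdiag[i] is a (possibly interpolated) value for the entries
    A[i][i+1] and A[i+1][i]; the diagonal is derived, not computed by quadrature.
    """
    n = len(offdiag) + 1
    A = [[0.0] * n for _ in range(n)]
    for i, a in enumerate(offdiag):
        A[i][i + 1] = a      # upper triangle: given off-diagonal value
        A[i + 1][i] = a      # lower triangle: copied by symmetry
    for i in range(n):
        # Diagonal entry is minus the sum of the off-diagonal entries of row i.
        A[i][i] = -sum(A[i][k] for k in range(n) if k != i)
    return A


A = kernel_preserving_tridiagonal([-1.0, -2.5, -4.0])
row_sums = [sum(row) for row in A]   # each row sum vanishes up to round-off
```

Because every row sums to zero, multiplying the matrix by a constant vector gives the zero vector, whatever values were supplied for the off-diagonal entries.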


Full text


Daniel Drzisga, Brendan Keith, Barbara Wohlmuth

Lehrstuhl für Numerische Mathematik, Fakultät für Mathematik (M2), Technische Universität München, Garching bei München

with support in $(-\frac{p+1}{2}\cdot{h},\frac{p+1}{2}\cdot{h})$

The $\widetilde{x}^{(k)}$ correspond to the midpoints of each function ${b}_{k}$.

Figure 7 presents graphs of stencil functions $\Phi_{\bm{\delta}}(\widetilde{\bm{x}})$ and their surrogates $\widetilde{\Phi}_{\bm{\delta}}(\widetilde{\bm{x}})$, obtained by B-spline interpolation, for various ${\bm{\delta}}$. The stencil functions are generated by an isogeometric NURBS basis and the symmetric bilinear form (8).

${\bm{\alpha}}=(\alpha_{1},\ldots,\alpha_{n})$, with each $0\leq\alpha_{i}\leq q+1$

and, moreover, $\Phi_{\bm{\delta}}\in W^{q+1,\infty}(\widetilde{\Omega})$. The result follows from Lemma 4.1 using $f=\Phi_{\bm{\delta}}$.

First, the sampling length $H$ and the sampling points need to be specified. For this purpose, we introduce the sampling parameter $M\in\mathbb{N}$ which relates the small scale $h$ to the coarse scale $H$ via $H=M\cdot h$.

Some of the diagonal entries are drawn in blue due to the modification of preserving the kernel.

Dirichlet boundary conditions are enforced for surrogate methods in the standard way; that is, by eliminating dofs from the original linear system. As usual, let $\widetilde{\mathsf{A}}$ be the full matrix without any consideration for boundary conditions. Without loss of generality, we may assume that all the smallest global indices correspond to the set of interior dofs, denoted by $I$, and all of the largest correspond to the set of Dirichlet dofs, denoted by $D$. The unconstrained linear system, $\widetilde{\mathsf{A}}\widetilde{\mathsf{u}}=\mathsf{f}$, may then be written in block form as follows:

(23)

$\displaystyle\begin{bmatrix}\widetilde{\mathsf{A}}_{II}&\widetilde{\mathsf{A}}_{ID}\\ \widetilde{\mathsf{A}}_{DI}&\widetilde{\mathsf{A}}_{DD}\end{bmatrix}\begin{bmatrix}\widetilde{\mathsf{u}}_{I}\\ \widetilde{\mathsf{u}}_{D}\end{bmatrix}=\begin{bmatrix}\mathsf{f}_{I}\\ \mathsf{f}_{D}\end{bmatrix}.$

Recall that all cardinal basis functions vanish at the domain boundary $\partial\Omega$. Therefore, following definition (18c), $\widetilde{\mathsf{A}}_{ID}={\mathsf{A}}_{ID}$ and $\widetilde{\mathsf{A}}_{DD}={\mathsf{A}}_{DD}$, where ${\mathsf{A}}_{ID}$ and ${\mathsf{A}}_{DD}$ are the corresponding submatrices of the standard stiffness matrix $\mathsf{A}$. Furthermore, because the values of $\widetilde{\mathsf{u}}_{D}$ are prescribed, (23) may be reduced to the linear system $\widetilde{\mathsf{A}}_{II}\widetilde{\mathsf{u}}_{I}=\mathsf{f}_{I}-{\mathsf{A}}_{ID}\widetilde{\mathsf{u}}_{D}$. This reduced system may be solved to determine all the unprescribed solution coefficients, $\widetilde{\mathsf{u}}_{I}$.

In this proof, we will use the symbols “$\lesssim$” and “$\eqsim$” to denote upper bounds and equivalence, respectively, up to constants depending at most on $p$ and $\bm{\varphi}$. For each $i=1,\ldots,N$, let $\mathcal{I}(i)$ be the set of indices $j$ such that $\operatorname*{supp}({N}_{i})\cap\operatorname*{supp}({N}_{j})\neq\emptyset$ and notice that $|\mathcal{I}(i)|\leq|\mathscr{D}|=(2p+1)^{n}$. We begin with the observation that $\sum_{i}\big(\mathsf{A}_{ij}-\widetilde{\mathsf{A}}_{ij}\big)=\sum_{j}\big(\mathsf{A}_{ij}-\widetilde{\mathsf{A}}_{ij}\big)=0$. With these two identities in hand, we find that

(27)

$\displaystyle a(v_{h},w_{h})-\widetilde{a}(v_{h},w_{h})=-\frac{1}{2}\,\sum_{i,j}\big(\mathsf{A}_{ij}-\widetilde{\mathsf{A}}_{ij}\big)(\mathsf{v}_{i}-\mathsf{v}_{j})(\mathsf{w}_{i}-\mathsf{w}_{j})$

$\displaystyle\leq|\mathsf{A}-\widetilde{\mathsf{A}}|_{\max}\sum_{i}\sum_{j\in\mathcal{I}(i)}|\mathsf{v}_{i}-\mathsf{v}_{j}|\,|\mathsf{w}_{i}-\mathsf{w}_{j}|$

$\displaystyle\leq|\mathsf{A}-\widetilde{\mathsf{A}}|_{\max}\sum_{i}\Bigg(\sum_{j\in\mathcal{I}(i)}|\mathsf{v}_{i}-\mathsf{v}_{j}|^{2}\Bigg)^{1/2}\Bigg(\sum_{j\in\mathcal{I}(i)}|\mathsf{w}_{i}-\mathsf{w}_{j}|^{2}\Bigg)^{1/2}.$
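The first equality in (27) relies only on the difference matrix being symmetric with zero row sums, and it can be checked numerically. The snippet below is a small sanity check under stated assumptions: the matrix `E` is a randomly generated stand-in for $\mathsf{A}-\widetilde{\mathsf{A}}$, and the function name `check_identity` is hypothetical.

```python
import random


def check_identity(n=6, seed=0):
    """Compare w^T E v with -1/2 * sum_ij E_ij (v_i - v_j)(w_i - w_j)."""
    rng = random.Random(seed)
    # Random symmetric matrix; the diagonal is chosen so every row sums to zero.
    E = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            E[i][j] = E[j][i] = rng.uniform(-1.0, 1.0)
    for i in range(n):
        E[i][i] = -sum(E[i][j] for j in range(n) if j != i)
    v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    w = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    # Left-hand side: the bilinear form difference written with coefficients.
    lhs = sum(E[i][j] * v[j] * w[i] for i in range(n) for j in range(n))
    # Right-hand side: the pairwise-difference form used in (27).
    rhs = -0.5 * sum(E[i][j] * (v[i] - v[j]) * (w[i] - w[j])
                     for i in range(n) for j in range(n))
    # Crude bound from the second line of (27), summing over all index pairs.
    e_max = max(abs(E[i][j]) for i in range(n) for j in range(n))
    bound = e_max * sum(abs(v[i] - v[j]) * abs(w[i] - w[j])
                        for i in range(n) for j in range(n))
    return lhs, rhs, bound


lhs, rhs, bound = check_identity()
```

The two values agree up to round-off, and the bound holds because the diagonal terms of the pairwise sum vanish identically.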

For the time being, fix an index $i\in\{1,\ldots,N\}$. For each coefficient vector $\mathsf{v}\in\mathbb{R}^{N}$, define $|\mathsf{v}|_{\mathcal{I}(i)}=\big(\sum_{j\in\mathcal{I}(i)}|\mathsf{v}_{i}-\mathsf{v}_{j}|^{2}\big)^{1/2}$. One may easily check that $|\,\cdot\,|_{\mathcal{I}(i)}$ is a seminorm on $\mathbb{R}^{N}$. In order to identify its kernel, simply observe that $|\mathsf{v}|_{\mathcal{I}(i)}=0$ iff $\mathsf{v}_{j}=\mathsf{v}_{i}$ for each index $j\in\mathcal{I}(i)$. Since $i\in\mathcal{I}(i)$, an equivalent way of stating this condition is that there exists some constant $C\in\mathbb{R}$ such that $\mathsf{v}_{j}=C$ for each $j\in\mathcal{I}(i)$. Now, recall that ${v}_{h}({\bm{x}})=\sum_{i}\mathsf{v}_{i}{N}_{i}({\bm{x}})$ for all ${\bm{x}}\in{\Omega}$ and consider the following alternative seminorm:

(28)

$|\mathsf{v}|_{{N}_{i}}=\|\nabla{v}_{h}\|_{0,\operatorname*{supp}({N}_{i})}\,.$

Notice that $|\mathsf{v}|_{{N}_{i}}=0$ iff ${v}_{h}({\bm{x}})$ is equal to a constant on $\operatorname*{supp}({N}_{i})$, say $C\in\mathbb{R}$. From the partition of unity property inherent to all NURBS bases, it must hold that $C={v}_{h}(\bm{x})=\sum_{j\in\mathcal{I}(i)}C{N}_{j}(\bm{x})$ for all $\bm{x}\in\operatorname*{supp}({N}_{i})$. In other words, since $\{{N}_{j}|_{\operatorname*{supp}(N_{i})}\}_{j\in\mathcal{I}(i)}$ is a linearly independent set, $|\mathsf{v}|_{{N}_{i}}=0$ iff $\mathsf{v}_{j}=C$ for each $j\in\mathcal{I}(i)$. Together with the previous paragraph, this shows that the kernels of $|\,\cdot\,|_{{N}_{i}}$ and $|\,\cdot\,|_{\mathcal{I}(i)}$ are identical; namely, $|\mathsf{v}|_{{N}_{i}}=0$ iff $|\mathsf{v}|_{\mathcal{I}(i)}=0$ iff $\mathsf{v}\in Q_{i}$, where

(29)

$Q_{i}=\{\mathsf{v}\in\mathbb{R}^{N}\colon\mathsf{v}_{j}=\mathsf{v}_{i}\text{ for each }j\in\mathcal{I}(i)\}\,.$

Clearly, $|\,\cdot\,|_{{N}_{i}}$ and $|\,\cdot\,|_{\mathcal{I}(i)}$ induce norms on the quotient space $\mathbb{R}^{N}/Q_{i}$. The next important observation is that $\mathrm{dim}(Q_{i})=N+1-|\mathcal{I}(i)|$, which may be witnessed by inspection. Because $\mathrm{dim}(\mathbb{R}^{N}/Q_{i})=|\mathcal{I}(i)|-1\leq(2p+1)^{n}-1$ is finite and bounded in terms of $p$ and $n$ only, it follows from the well-known equivalence of norms on finite dimensional vector spaces (e.g., the vector space $\mathbb{R}^{N}/Q_{i}$) that the seminorms $|\,\cdot\,|_{{N}_{i}}$ and $|\,\cdot\,|_{\mathcal{I}(i)}$ are equivalent. Of course, the corresponding equivalence constants will depend on $h$, $\bm{\varphi}$, and $p$. Nevertheless, a standard scaling argument is all that is required to see that $|\mathsf{v}|_{\mathcal{I}(i)}^{2}\eqsim h^{2-n}|\mathsf{v}|_{{N}_{i}}^{2}$. Therefore, employing (27), we may simply write

(30)

$\displaystyle|a(v_{h},w_{h})-\widetilde{a}(v_{h},w_{h})|$ $\displaystyle\lesssim h^{2-n}|\mathsf{A}-\widetilde{\mathsf{A}}|_{\operatorname*{\vphantom{p}max}}\sum_{i}\big{\|}\nabla v_{h}\big{\|}_{0,\operatorname*{supp}({N}_{i})}\big{\|}\nabla w_{h}\big{\|}_{0,\operatorname*{supp}({N}_{i})}\,.$

The proof is completed by applying the discrete Cauchy–Schwarz inequality to the right-hand side of the inequality above and employing the fact that $\big(\sum_{i}\|f\|_{0,\operatorname*{supp}({N}_{i})}^{2}\big)^{1/2}\eqsim\|f\|_{0}$, for all $f\in L^{2}(\Omega)$.

see, e.g., [17, Chapter 2.5].

with using SciPy; BLAS library in version 3.7.1; Ubuntu 18.04.2

Let $t_{\mathrm{std}}$ be the time required to assemble the matrix with the standard approach and $t_{\mathrm{surr}}$ the time using the surrogate approach. The speed-up from using the surrogate approach is then defined as $\frac{t_{\mathrm{std}}}{t_{\mathrm{surr}}}-1$.
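The elimination of Dirichlet dofs described in connection with (23) can be illustrated on a tiny dense system. This is a minimal sketch, not the GeoPDEs implementation: the helper names `solve_dense` and `solve_with_dirichlet`, the explicit index lists, and the 1D test matrix are all assumptions made for illustration.

```python
def solve_dense(A, b):
    """Gaussian elimination with partial pivoting (small dense systems only)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]   # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= factor * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def solve_with_dirichlet(A, f, interior, dirichlet, u_D):
    """Solve A_II u_I = f_I - A_ID u_D and return the full coefficient vector."""
    A_II = [[A[i][j] for j in interior] for i in interior]
    rhs = [f[i] - sum(A[i][d] * g for d, g in zip(dirichlet, u_D))
           for i in interior]
    u_I = solve_dense(A_II, rhs)
    u = [0.0] * len(f)
    for i, value in zip(interior, u_I):
        u[i] = value
    for d, g in zip(dirichlet, u_D):
        u[d] = g                      # prescribed values are copied, not solved for
    return u


# 1D Laplacian on five nodes with u = 0 at node 0 and u = 1 at node 4.
A = [[1.0, -1.0, 0.0, 0.0, 0.0],
     [-1.0, 2.0, -1.0, 0.0, 0.0],
     [0.0, -1.0, 2.0, -1.0, 0.0],
     [0.0, 0.0, -1.0, 2.0, -1.0],
     [0.0, 0.0, 0.0, -1.0, 1.0]]
u = solve_with_dirichlet(A, [0.0] * 5, interior=[1, 2, 3],
                         dirichlet=[0, 4], u_D=[0.0, 1.0])
```

Only the interior block and the interior-to-Dirichlet coupling block enter the reduced system, which is why the rows belonging to Dirichlet dofs never need to be assembled for the solve.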

Abstract

A new methodology in isogeometric analysis (IGA) is presented. This methodology delivers low-cost variable-scale approximations (surrogates) of the matrices which IGA conventionally requires to be computed from element-scale quadrature formulas. To generate surrogate matrices, quadrature must only be performed on certain elements in the computational domain. This, in turn, determines only a subset of the entries in the final matrix. The remaining matrix entries are computed by a simple B-spline interpolation procedure. Poisson’s equation, membrane vibration, plate bending, and Stokes’ flow problems are studied. In these problems, the use of surrogate matrices has a negligible impact on solution accuracy. Because only a small fraction of the original quadrature must be performed, we are able to report beyond a fifty-fold reduction in overall assembly time in the same software. The capacity for even further speed-ups is clearly demonstrated. The implementation used here was achieved by a small number of modifications to the open-source IGA software library GeoPDEs. Similar modifications could be made to other present-day software libraries.

keywords:

Assembly, surrogate numerical methods, isogeometric analysis, a priori analysis.

1 Introduction

Avoiding unnecessary work is of utmost importance when computing at the frontiers of contemporary research. To frame a workable definition, recall that practical simulations in science and engineering involve a large number of possible sources of error. For instance, we highlight the categories of modeling error, numerical error, and data error, each of which has many subcategories. The total error in a simulation is controlled by the aggregate of each relevant source of error. In this paper, “unnecessary work” (or, more precisely, over-computation) is any machine expense used to drive one source of error in a problem far below the total error. It cannot be overstated that removing sources of over-computation can have an outsized influence on the computational cost of getting an accurate solution.

In some instances, circumventing over-computation is the simplest way to accelerate a numerical algorithm. For example, in the use of iterative methods, for both linear and non-linear problems, it has long been acknowledged that over-solving a discretized problem is a needless expense. Relaxing iterative solver errors usually reduces to just adjusting the tolerances naturally built into established algorithms. In other instances, sources of over-computation are less conspicuous and avoiding them requires the development of new algorithms. For example, in the field of uncertainty quantification, it has recently come to light that sampling error can be relaxed and, in turn, computational cost can be significantly reduced by the use of a tunable surrogate response surface [10, 36].

The focus of this article is the Galerkin form of isogeometric analysis (IGA) [28, 12]. At first sight, in view of the long list of computer methods which rose beforehand, the Galerkin isogeometric method may be seen as a rather paradigmatic approach to the discretization of partial differential equations (PDEs). Indeed, Galerkin IGA methods are little more than finite element methods which employ non-uniform rational B-spline (NURBS) bases [27]. Although it was immediately shown by Hughes et al. [28] that the use of such a basis improves the interoperability between computer-aided design (CAD) and PDE analysis, many other benefits of the IGA approach were also demonstrated early on in the IGA literature. Of particular note, the arbitrary smoothness of NURBS bases generally improves the accuracy per degree of freedom and lends itself to convenient techniques for the discretization of high-order PDEs [30, 27]. It is these and other serendipitous features of IGA which have contributed to its truly meteoric success in modern computational science and engineering research.

It is well-established that traditional isogeometric methods face a great computational burden at the point of matrix assembly. This is due, in part, to the large support of the basis functions. Although many other common concerns are naturally alleviated by the IGA paradigm, this particular challenge is clearly evidenced by the expansive literature on quadrature rules and accelerated assembly algorithms [24, 8, 38, 35, 26, 2, 34, 33, 11, 23, 29, 3, 25]. Indeed, we may further accentuate this remark with the following quote from the 2014 review article [13]:

…at the moment the assembly of the matrix is the most time-consuming part of isogeometric codes. The development of optimal assembly procedures is an important task required to render isogeometric methods a competitive technology.

In this article, we present a simple methodology to avoid over-assembling matrices in IGA. Roughly speaking, it requires performing quadrature for only a small fraction of the trial and test basis function interactions and then approximating the rest through, for example, interpolation. This leads to a large sparse matrix where the majority of entries have not been computed using any quadrature at all. Usually, such matrices will not coincide with the ones generated by performing quadrature for every non-zero entry (cf. Section 5.4), but they can be interpreted as surrogates for those matrices.
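To make the idea concrete, the following is a minimal one-dimensional sketch under simplifying assumptions that are not from the paper: piecewise-linear hat functions stand in for the NURBS basis, the coefficient function `coeff` is hypothetical, and plain linear interpolation replaces the higher-order B-spline interpolation. Quadrature is performed only at every `M`-th off-diagonal entry, and the remaining entries are interpolated.

```python
import math


def coeff(x):
    # Hypothetical smooth coefficient K(x), standing in for PDE and geometry data.
    return 1.0 + 0.5 * math.sin(2.0 * math.pi * x)


def offdiag_entry(i, h):
    """Entry A[i, i+1] for hat functions on a uniform grid, computed by 2-point
    Gauss quadrature of K(x) * phi_i'(x) * phi_{i+1}'(x) over [i*h, (i+1)*h]."""
    g = 0.5 / math.sqrt(3.0)
    points = (i * h + (0.5 - g) * h, i * h + (0.5 + g) * h)
    # On this element phi_i' = -1/h and phi_{i+1}' = +1/h; each weight is h/2.
    return sum(0.5 * h * coeff(x) * (-1.0 / h) * (1.0 / h) for x in points)


def standard_offdiag(m, h):
    """Quadrature for every one of the m off-diagonal entries."""
    return [offdiag_entry(i, h) for i in range(m)]


def surrogate_offdiag(m, h, M):
    """Quadrature only at every M-th index (H = M*h); interpolation elsewhere."""
    assert (m - 1) % M == 0, "sketch assumes the last index is a sample point"
    sampled = [offdiag_entry(i, h) for i in range(0, m, M)]
    out = []
    for i in range(m):
        k, r = divmod(i, M)
        if r == 0:
            out.append(sampled[k])            # entry computed by quadrature
        else:
            t = r / M                         # linear interpolation weight
            out.append((1.0 - t) * sampled[k] + t * sampled[k + 1])
    return out


m, M = 257, 8
h = 1.0 / m
std = standard_offdiag(m, h)
sur = surrogate_offdiag(m, h, M)
rel_err = max(abs(a - b) for a, b in zip(std, sur)) / max(abs(a) for a in std)
n_quadrature = len(range(0, m, M))            # entries that needed quadrature
```

In this toy setting only 33 of the 257 entries are computed by quadrature. The interpolated entries differ from the standard ones by an amount governed by the smoothness of the coefficient and by the sampling length `M * h`.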

The main idea used here was first introduced in the context of first-order finite elements by Bauer et al. in [6]. Thereafter, applications to peta-scale geodynamical simulations were presented in [4, 5] and a theoretical analysis was given in [21]. In the massively parallel applications [6, 4, 5], it was natural to work with so-called “macro-meshes” as well as a piecewise polynomial space for resolving the surrogate matrices. This choice was motivated by a low communication cost across the faces of the macro-elements, a convenient cache-aware implementation, and the fact that a hybrid mesh structure allowed for extremely fast evaluation of the three-dimensional polynomials; see [4, 5] for further details. In contrast, the surrogate matrices in this paper are computed using a B-spline interpolation space. With this particular strategy, we demonstrate that the cost of matrix assembly in conventional IGA codes can be reduced by an order of magnitude.

Our approach bears some similarities to the integration by interpolation and lookup (IIL) approach proposed in [33, 34]. In those two works, an integrand factor from the weak form, composed of both the coefficients of the underlying PDE as well as the geometry mapping, is approximated. In this work, the actual entries of the final matrix are shown to be related to a small number of smooth so-called stencil functions; instead of a factor in the integrand, it is these stencil functions which are approximated.

An advantage of the IIL approach is that, in theory, it does not require a uniform knot vector assumption (cf. Section 2). However, in practice, this assumption is necessary in order to obtain compact lookup tables [34]. On the other hand, one advantage of our approach is that it can be easily implemented using existing IGA assembly paradigms. Another advantage is that the implementation is identical whether using a B-spline or a NURBS basis.

Before moving on, some other important remarks deserve to be emphasized:

•

The methodology we propose for IGA applications is essentially independent of the quadrature rule used at the individual element or basis function level. This lays bare the possibility for it to be used in conjunction with many other cutting edge techniques for accelerated IGA assembly.

• For our new surrogate methods, it would be most efficient if the matrix entries which require quadrature were to be computed based on individual basis function interactions. This leaves out standard element-by-element assembly strategies, but would work well with the control point pairwise method proposed in [32] or, ideally, with a row-by-row approach; e.g., [25, 11, 38]. Nevertheless, one should still expect to see significant speed-ups with surrogate methods in standard element-by-element codes, at least for moderate polynomial orders. In order to underscore this fact, we did not develop a stand-alone code. Instead, we implemented our surrogate methods by simply modifying the assembly routines in the open-source library GeoPDEs [20, 42], leaving every other aspect of the code fixed. For illustration, the reader may refer to the left and right sides of Fig. 1 to compare the relative timings before and after some relatively minor changes were made to this software (cf. Section 6 and [22]). In both cases, the differences in solving time and solution accuracy were negligible. We expect that most other element-by-element IGA codes should be easy to modify in a similar manner.

• Many efficient assembly strategies for IGA see their performance advantage only in the high polynomial order regime. In contrast, the performance advantage of our approach usually grows with each $h$-refinement. Indeed, at just over one million degrees of freedom, our experiments demonstrate assembly speed-ups beyond fifty times, in the exact same code, with a simple second-order NURBS basis (see Section 7.3.3).

In our experiments, we analyze surrogate IGA methods for Poisson’s equation, membrane vibration, plate bending, and Stokes’ flow problems. The Poisson case is analyzed in detail and the additional experiments are provided in order to motivate further study. It is our eventual goal to adapt our methods to a matrix-free framework, similar to what has been used recently in low-order settings [21, 6, 4, 5]. This would certainly be helpful in order to reach the full potential of IGA in extreme scale computations.

In the next section, we take stock of the majority of mathematical notation used in the remainder of the paper. In Section 3, we introduce the notion of a stencil function in the IGA context. In Section 4, we investigate the accuracy of B-spline interpolation with regard to stencil functions. In Section 5, we use interpolants of the stencil functions (i.e., surrogate stencil functions) to define surrogate matrices for IGA. Section 6 consists of a brief description of our software implementation. A more complete description is provided in [22]. In Sections 7, 8, 9 and 10, we examine surrogate IGA methods for Poisson’s equation, membrane vibration, plate bending, and Stokes’ flow problems, respectively. Appendix A is included to support some of the analysis carried out in Section 4.

2 Preliminaries

In this section, we lay out the principal mathematical focus and notation of the paper.

2.1 Model problems and notation

Let $\Omega\operatorname{\subseteq}\mathbb{R}^{n}$ be a domain, $n=2,3$ . Let $V=V(\Omega)$ be a Hilbert space over $\mathbb{R}$ , the field of real numbers, and let $V^{\ast}$ denote its topological dual. For historical reasons, we proceed by adopting notation from the $h$ -version of the finite element method and thus let $V_{h}$ denote a finite-dimensional subspace of $V$ . Although we also deal with a number of important alternatives (see, e.g., Sections 8 and 10), we are chiefly interested in the following three problems:

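$$\text{Find } u\in V \text{ satisfying}\quad a(u,v)=F(v)\quad\text{for all } v\in V,\tag{1a}$$

$$\text{Find } u_{h}\in V_{h} \text{ satisfying}\quad a(u_{h},v_{h})=F(v_{h})\quad\text{for all } v_{h}\in V_{h},\tag{1b}$$

$$\text{Find } \widetilde{u}_{h}\in V_{h} \text{ satisfying}\quad \widetilde{a}(\widetilde{u}_{h},v_{h})=F(v_{h})\quad\text{for all } v_{h}\in V_{h}.\tag{1c}$$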

Here and throughout, $a:V\times V\to\mathbb{R}$ is a continuous and coercive bilinear form, $\widetilde{a}:V_{h}\times V_{h}\to\mathbb{R}$ is an approximation of $a|_{V_{h}\times V_{h}}$ , hereby deemed the surrogate for $a(\cdot,\cdot)$ , and $F\in V^{\ast}$ is a bounded linear functional.

To keep the exposition simple and to the point, we will assume that the physical domain of every problem $\Omega\operatorname{\subseteq}\mathbb{R}^{n}$ is defined as the image of a single parametric domain $\widehat{\Omega}=(0,1)^{n}$ . This leaves all of our analysis in the single patch geometry setting, $\Omega=\bm{\varphi}(\widehat{\Omega})$ , for some diffeomorphism $\bm{\varphi}:\widehat{\Omega}\to\mathbb{R}^{n}$ of sufficient regularity; see Figure 2. The single patch setting is by no means a necessary assumption. The entirety of the analysis considered here can easily be generalized to the multi-patch setting (cf. Section 3.5). However, in order to stay in the isogeometric setting, we assume that $\bm{\varphi}(\widehat{\bm{x}})=\sum_{i}\mathsf{c}_{i}\widehat{N}_{i}(\widehat{\bm{x}})$ , where each $\mathsf{c}_{i}\in\mathbb{R}^{n}$ is a control point vector and each $\widehat{N}_{i}(\widehat{\bm{x}})$ is a NURBS basis function on the parametric domain $\widehat{\Omega}$ . Here, NURBS basis functions are defined in the standard way, as described in Section 2.2.

For matrices $\mathsf{M}\in\mathbb{R}^{l\times m}$ , define the ${\operatorname*{\vphantom{p}max}}$ -norm, $\|\mathsf{M}\|_{\operatorname*{\vphantom{p}max}}=\operatorname*{\vphantom{p}max}_{i,j}|\mathsf{M}_{ij}|$ . For any function $v:\Omega\to\mathbb{R}$ , we will use the notation, $\|v\|_{0}$ , $\|v\|_{1}$ , and $\|v\|_{2}$ , for the canonical $L^{2}(\Omega)$ -, $H^{1}(\Omega)$ -, and $H^{2}(\Omega)$ -norms, respectively.

Moreover, if $v$ is smooth, we define its support as $\operatorname*{supp}(v)=\{\bm{x}\in\Omega\,:\,v(\bm{x})\neq 0\}$ .

When dealing with a domain $\mathcal{D}\operatorname{\subseteq}\Omega$ , denote the related $L^{2}(\mathcal{D})$ , $H^{1}(\mathcal{D})$ , and $H^{2}(\mathcal{D})$ norms by $\|v\|_{0,\mathcal{D}}$ , $\|v\|_{1,\mathcal{D}}$ , and $\|v\|_{2,\mathcal{D}}$ , respectively. Denote the space of univariate polynomials of degree at most $q$ by $\mathcal{P}_{q}$ . Likewise, denote the space of multivariate polynomials of degree at most $q$ , in each Cartesian direction $\bm{e}_{i}$ , by $\mathcal{Q}_{q}=[\mathcal{P}_{q}]^{n}$ and denote $\mathcal{Q}_{q}(\mathcal{D})=\{f|_{\mathcal{D}}\,:\,f\in\mathcal{Q}_{q}\}$ . We will often deal with Cartesian subdomains $\mathcal{D}=\mathcal{D}_{1}\times\cdots\times\mathcal{D}_{n}$ . In this case, it is natural to deal with Cartesian–Sobolev seminorms; e.g., $[f]_{W^{r,\infty}(\mathcal{D})}=\sum_{i=1}^{n}\|D^{r\cdot\bm{e}_{i}}f\|_{L^{\infty}(\mathcal{D})}$ . Note that $[f]_{W^{r,\infty}(\mathcal{D})}\leq|f|_{W^{r,\infty}(\mathcal{D})}=\sum_{|\bm{\alpha}|=r}\|D^{{\bm{\alpha}}}f\|_{L^{\infty}(\mathcal{D})}$ . All remaining notation will be defined as it arises.

2.2 Cardinal B-splines and NURBS

Let $m\geq 2p+1$ and $\{b_{k}\}_{k=1}^{m}$ be an order $p$ B-spline basis on the unit interval $(0,1)$ . Let $N=m^{n}$ and let $\{\widehat{N}_{i}\}_{i=1}^{N}$ be a corresponding NURBS basis on $\widehat{\Omega}$ . Namely,

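$$\widehat{N}_{i}(\widehat{\bm{x}})=\frac{w_{i}\widehat{B}_{i}(\widehat{\bm{x}})}{\sum_{j=1}^{N}w_{j}\widehat{B}_{j}(\widehat{\bm{x}})},\tag{2}$$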

where each $\widehat{B}_{i}(\widehat{\bm{x}})={b}_{i_{1}}(\widehat{x}_{1})\cdots{b}_{i_{n}}(\widehat{x}_{n})$ is a multivariate B-spline of uniform order $p$ and each $w_{i}>0$ is a fixed weight parameter. Here and from now on, we identify every global index $1\leq i\leq N$ with a multi-index $\bm{i}=(i_{1},\ldots,i_{n})$ , $1\leq i_{k}\leq m$ , through the colexicographical relationship $i=i_{1}+(i_{2}-1)m+\cdots+(i_{n}-1)m^{n-1}$ .
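The index identification above can be sketched in a few lines of Python. The function names `multi_to_global` and `global_to_multi` are hypothetical and chosen only for illustration; both use 1-based indices, as in the text.

```python
def multi_to_global(multi, m):
    """Map a 1-based multi-index (i_1, ..., i_n) to the 1-based global index
    i = i_1 + (i_2 - 1) m + ... + (i_n - 1) m^(n-1)."""
    return 1 + sum((ik - 1) * m**k for k, ik in enumerate(multi))


def global_to_multi(i, m, n):
    """Inverse map: recover the 1-based multi-index from the global index i."""
    r = i - 1
    multi = []
    for _ in range(n):
        multi.append(r % m + 1)  # fastest-running index comes first
        r //= m
    return tuple(multi)
```

With $m=4$ and $n=2$, for example, the multi-index $(2,3)$ corresponds to the global index $2+(3-1)\cdot 4=10$.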

Generally, a univariate B-spline basis $\{{b}_{k}\}_{k=1}^{m}$ is defined by an ordered multiset, or knot vector, $\Xi=\{\xi_{1},\ldots,\xi_{m+p+1}\}$ . In this paper, we deal only with open uniform knot vectors; i.e., $\xi_{1},\ldots,\xi_{p+1}=0$ , $\xi_{m+1},\ldots,\xi_{m+p+1}=1$ , and $\xi_{k+1}-\xi_{k}=\frac{1}{m-p}$ , otherwise. The quantity $h=\operatorname*{\vphantom{p}max}_{1\leq k\leq m-1}|\xi_{k+1}-\xi_{k}|=\frac{1}{m-p}$ will be an important parameter for us, which we hereby refer to as the mesh size. Clearly, we could consider NURBS spaces with different orders $p_{1},\ldots,p_{n}$ in each Cartesian direction [37, 28]. In order to simplify the exposition, we avoid this complication.

For an explicit example of an open uniform knot vector, consider

[TABLE]

The corresponding $p=2$ B-spline basis is depicted in Fig. 3. Observe that all but four of the basis functions (highlighted in red) are identical, up to an equally spaced set of translations. These functions are called cardinal B-splines [41, 39, 40].

Let $\widetilde{x}^{(k)}=(k-\frac{p+1}{2})\cdot h$ , for each ${k}=p+1,\ldots,m-p$ . In general, there are always $m-2p$ univariate cardinal B-spline basis functions which can each be expressed ${b}_{k}(\widehat{x})={b}(\widehat{x}-\widetilde{x}^{(k)})$ , for some function, ${b}(\widehat{x})$ , centered at the origin (see, e.g., Fig. 3).

Likewise, there are $(m-2p)^{n}$ multivariate cardinal B-splines. That is, $\widehat{B}_{i}(\widehat{\bm{x}})=\widehat{B}(\widehat{\bm{x}}-\widetilde{\bm{x}}_{i})$ , where $\widetilde{\bm{x}}_{i}=\big(\widetilde{x}^{(i_{1})},\ldots,\widetilde{x}^{(i_{n})}\big)$ and $\widehat{B}(\widehat{\bm{x}})=b({\widehat{x}_{1}})\cdots b({\widehat{x}_{n}})$ . For future reference, define the set of all such $\widetilde{\bm{x}}_{i}$ as $\widetilde{\mathbb{X}}$ . Also, notice that the ratio of cardinal B-spline basis functions to total B-spline basis functions quickly tends to unity, $\big{(}\frac{m-2p}{m}\big{)}^{n}\to 1$ , as $m$ increases.
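The translation property of the univariate cardinal B-splines can be checked numerically. The following is a minimal sketch, assuming that "order $p$" means polynomial degree $p$ and using the standard Cox-de Boor recursion; the helper names `open_uniform_knots` and `bspline` are hypothetical, and the values $m=8$, $p=2$ are chosen only for illustration.

```python
def open_uniform_knots(m, p):
    """Open uniform knot vector with m + p + 1 entries and mesh size h = 1/(m - p)."""
    h = 1.0 / (m - p)
    return [0.0] * (p + 1) + [k * h for k in range(1, m - p)] + [1.0] * (p + 1)


def bspline(k, p, knots, x):
    """Cox-de Boor recursion for the k-th (1-based) degree-p basis function."""
    i = k - 1
    if p == 0:
        return 1.0 if knots[i] <= x < knots[i + 1] else 0.0
    left = right = 0.0
    if knots[i + p] > knots[i]:  # skip terms with a repeated-knot (zero) denominator
        left = (x - knots[i]) / (knots[i + p] - knots[i]) * bspline(k, p - 1, knots, x)
    if knots[i + p + 1] > knots[i + 1]:
        right = ((knots[i + p + 1] - x) / (knots[i + p + 1] - knots[i + 1])
                 * bspline(k + 1, p - 1, knots, x))
    return left + right


m, p = 8, 2
knots = open_uniform_knots(m, p)
h = 1.0 / (m - p)

# Centers of the m - 2p cardinal B-splines, k = p+1, ..., m-p.
centers = {k: (k - (p + 1) / 2) * h for k in range(p + 1, m - p + 1)}

# Compare every cardinal function with the first one at matching offsets
# from its center (offsets chosen away from the knots).
offsets = [-1.3 * h, -0.4 * h, 0.0, 0.4 * h, 1.1 * h]
reference = [bspline(p + 1, p, knots, centers[p + 1] + t) for t in offsets]
max_mismatch = max(
    abs(bspline(k, p, knots, centers[k] + t) - ref)
    for k in centers
    for t, ref in zip(offsets, reference)
)
```

Here `max_mismatch` should vanish up to round-off, which reflects the identity ${b}_{k}(\widehat{x})={b}(\widehat{x}-\widetilde{x}^{(k)})$ for the cardinal indices.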

3 Surrogate matrices: Exploiting basis structure

Equations 1b and 1c, respectively, induce two related matrix equations,

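$$\mathsf{A}\mathsf{u}=\mathsf{f}\qquad\text{and}\qquad\widetilde{\mathsf{A}}\widetilde{\mathsf{u}}=\mathsf{f},\tag{4}$$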

for basis function coefficients $\mathsf{u},\widetilde{\mathsf{u}}\in\mathbb{R}^{N}$ . As mentioned previously, the key idea in this paper is constructing the majority of the surrogate stiffness matrix $\widetilde{\mathsf{A}}$ via interpolation of the true stiffness matrix $\mathsf{A}$ . In this section, we first describe exactly what is meant by this statement and then demonstrate how isogeometric analysis makes it possible.

3.1 Stencil functions

Recall 1b. Generally, every function $v_{h}\in V_{h}$ can be identified with a unique function on the domain $\widehat{\Omega}$ through a suitable pushforward operator $\bm{\varphi}_{\ast}$ . Namely, $v_{h}=\bm{\varphi}_{\ast}\widehat{v}_{h}$ . Let $\widehat{V}_{h}$ be the set of all such $\widehat{v}_{h}$ , which is a discrete space in the parametric domain $\widehat{\Omega}$ . Accordingly, the bilinear form $a:V_{h}\times V_{h}\to\mathbb{R}$ can be identified with a parametric domain bilinear form $\widehat{a}:\widehat{V}_{h}\times\widehat{V}_{h}\to\mathbb{R}$ in such a way that $a(w_{h},v_{h})=\widehat{a}(\widehat{w}_{h},\widehat{v}_{h})$ , for all $\widehat{w}_{h},\widehat{v}_{h}\in\widehat{V}_{h}$ .

Let $\{\phi_{i}\}=\{\bm{\varphi}_{\ast}\widehat{\phi}_{i}\}$ be a basis for $V_{h}$ with $\{\widehat{\phi}_{i}\}$ the corresponding basis for $\widehat{V}_{h}$ . The fundamental observation in the surrogate matrix methodology now follows. If $\widehat{\phi}$ is some fixed reference function and, for a set of indices $i,j$ , $\widehat{\phi}_{i}(\widehat{\bm{x}})=\widehat{\phi}(\widehat{\bm{x}}-\widetilde{\bm{x}}_{i})$ and $\widehat{\phi}_{j}(\widehat{\bm{x}})=\widehat{\phi}(\widehat{\bm{x}}-\widetilde{\bm{x}}_{j})$ , then

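$$a(\phi_{j},\phi_{i})=\widehat{a}(\widehat{\phi}_{j},\widehat{\phi}_{i})=\widehat{a}\big(\widehat{\phi}(\cdot-\widetilde{\bm{x}}_{j}),\widehat{\phi}(\cdot-\widetilde{\bm{x}}_{i})\big)=:\Phi(\widetilde{\bm{x}}_{j},\widetilde{\bm{x}}_{i}).\tag{5}$$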

Here, in the rightmost equality, the definition of a new scalar-valued function $\Phi(\cdot,\cdot)$ has been made, wherein any dependence on the mesh size $h$ has been implicitly assumed. This function, $\Phi(\widetilde{\bm{x}}_{j},\widetilde{\bm{x}}_{i})$ , may also be expressed in terms of $\widetilde{\bm{x}}_{i}$ and a translation ${\bm{\delta}}=\widetilde{\bm{x}}_{j}-\widetilde{\bm{x}}_{i}$ . In this alternative characterization, after denoting $\mathsf{A}_{ij}=a(\phi_{j},\phi_{i})$ , we may write

$$\mathsf{A}_{ij}=\Phi_{\bm{\delta}}(\widetilde{\bm{x}}_{i}),\tag{6}$$

where $\Phi_{\bm{\delta}}(\widetilde{\bm{x}}_{i})=\Phi(\widetilde{\bm{x}}_{i}+{\bm{\delta}},\widetilde{\bm{x}}_{i})=\Phi(\widetilde{\bm{x}}_{j},\widetilde{\bm{x}}_{i})$ and ${\bm{\delta}}$ may be treated as a parameter.

For a fixed number of translations ${\bm{\delta}}$ , these so-called stencil functions, $\Phi_{\bm{\delta}}(\cdot)$ , can be identified with the majority of entries in many IGA stiffness matrices. In many circumstances, each $\Phi_{\bm{\delta}}(\cdot)$ is smooth and may, therefore, be interpolated after only being evaluated at a small number of points in the parametric domain $\widetilde{\bm{x}}_{i}\in{\widehat{\Omega}}$ . After denoting the interpolants (i.e., the surrogate stencil functions) by $\widetilde{\Phi}_{\bm{\delta}}(\cdot)$ , we simply define

$$\widetilde{\mathsf{A}}_{ij}=\widetilde{\Phi}_{\bm{\delta}}(\widetilde{\bm{x}}_{i}).\tag{7}$$

Remark 3.1.

*In some cases, the stencil functions $\Phi_{\bm{\delta}}$ are themselves polynomials (see, e.g., Propositions 5.2, 8.1 and 10.1). Therefore, if polynomial interpolation of sufficiently high order is used, the true stiffness matrix can be generated exactly (i.e., $\widetilde{\Phi}_{\bm{\delta}}=\Phi_{\bm{\delta}}$ and thus $\widetilde{\mathsf{A}}=\mathsf{A}$ , up to round-off error) in significantly less time than with a traditional assembly algorithm. Otherwise, in many scenarios, a sufficiently accurate approximation of the stiffness matrix $\widetilde{\mathsf{A}}\approx\mathsf{A}$ will be generated. *
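To make the idea of this subsection concrete, the following is a minimal one-dimensional sketch and not the implementation used in this paper. It assumes the model form $\int_{0}^{1}K\,u'v'\,\mathrm{d}x$ with a hypothetical coefficient $K$, first-order ($p=1$) hat functions, a two-point Gauss rule standing in for element quadrature, and piecewise-linear interpolation of the sampled values in place of a higher-order B-spline interpolant. All function names are illustrative, and only the diagonal stencil function $\Phi_{0}$ is treated.

```python
import math


def elem_integral(K, a, h):
    """Two-point Gauss-Legendre approximation of the integral of K over [a, a + h]."""
    mid, g = a + h / 2, h / (2 * math.sqrt(3))
    return h / 2 * (K(mid - g) + K(mid + g))


def stencil_diag(K, x, h):
    """Phi_0(x): diagonal entry for the hat function centered at x (slopes +-1/h)."""
    return (elem_integral(K, x - h, h) + elem_integral(K, x, h)) / h**2


def interpolate_from_samples(phi, centers, sample_idx):
    """Evaluate phi only at the sampled centers; fill the rest by linear interpolation."""
    samples = {s: phi(centers[s]) for s in sample_idx}  # the only quadrature calls
    values = [0.0] * len(centers)
    for a, b in zip(sample_idx, sample_idx[1:]):
        for i in range(a, b + 1):
            t = (i - a) / (b - a)  # centers are equally spaced
            values[i] = (1 - t) * samples[a] + t * samples[b]
    return values


def max_rel_error(K, centers, sample_idx, h):
    """Largest deviation of the surrogate entries from the true ones, relative to the largest entry."""
    exact = [stencil_diag(K, x, h) for x in centers]
    approx = interpolate_from_samples(lambda x: stencil_diag(K, x, h), centers, sample_idx)
    return max(abs(e - s) for e, s in zip(exact, approx)) / max(abs(e) for e in exact)


p, m, M = 1, 19, 4
h = 1.0 / (m - p)
centers = [(k - (p + 1) / 2) * h for k in range(p + 1, m - p + 1)]
sample_idx = list(range(0, len(centers), M))  # 17 centers, so every 4th hits both ends

err_linear = max_rel_error(lambda x: 1.0 + x, centers, sample_idx, h)  # polynomial K
err_exp = max_rel_error(math.exp, centers, sample_idx, h)              # smooth K
```

In this sketch only 5 of the 17 diagonal entries are computed by quadrature. For the linear coefficient, $\Phi_{0}$ is itself linear and the interpolated entries agree with the true ones up to round-off, in line with Remark 3.1; for the exponential coefficient a small interpolation error remains.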

3.2 B-spline basis functions

Fix $V=H^{1}(\Omega)$ , $\bm{\varphi}:\widehat{\Omega}\to\Omega$ , $\widehat{V}_{h}=\mathrm{span}\{\widehat{B}_{i}\}$ , and, accordingly, $V_{h}=\mathrm{span}\{B_{i}\}$ , where each $B_{i}=\widehat{B}_{i}\circ\bm{\varphi}^{-1}$ . Consider the bilinear form $a(u,v)=\int_{\Omega}\nabla u\cdot\nabla v\,\mathrm{d}x$ . It is easy to verify that $a(\cdot,\cdot)$ pulls back to

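$$\widehat{a}(\widehat{w},\widehat{v})=\int_{\widehat{\Omega}}\widehat{\nabla}\widehat{w}^{\top}K\,\widehat{\nabla}\widehat{v}\,\mathrm{d}\widehat{\bm{x}},\qquad K=|\det D\bm{\varphi}|\,D\bm{\varphi}^{-1}D\bm{\varphi}^{-\top},\tag{8}$$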

with arguments $\widehat{w},\widehat{v}\in\widehat{V}=H^{1}(\widehat{\Omega})$ .

Recall Section 2.2. Assume that $\{\widehat{B}_{i}\}$ is generated by an open uniform knot vector $\Xi$ with $p>0$ fixed. Obviously, $\mathsf{A}_{ij}=\widehat{a}(\widehat{B}_{j},\widehat{B}_{i})$ . In the cardinal B-spline setting, $\widehat{B}_{i}(\widehat{\bm{x}})=\widehat{B}(\widehat{\bm{x}}-\widetilde{\bm{x}}_{i})$ and $\widehat{B}_{j}(\widehat{\bm{x}})=\widehat{B}(\widehat{\bm{x}}-\widetilde{\bm{x}}_{j})$ . Therefore, by a simple change of variables,

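$$\mathsf{A}_{ij}=\int_{\widehat{\omega}_{\bm{\delta}}}\widehat{\nabla}\widehat{B}_{\bm{\delta}}(\widehat{\bm{y}})^{\top}K(\widetilde{\bm{x}}_{i}+\widehat{\bm{y}})\,\widehat{\nabla}\widehat{B}(\widehat{\bm{y}})\,\mathrm{d}\widehat{\bm{y}},\tag{9}$$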

where ${\bm{\delta}}=\widetilde{\bm{x}}_{j}-\widetilde{\bm{x}}_{i}$ , $\widehat{B}_{\bm{\delta}}(\widehat{\bm{y}})=\widehat{B}(\widehat{\bm{y}}-{\bm{\delta}})$ , and $\widehat{\omega}_{\bm{\delta}}=\operatorname*{supp}(\widehat{B})\cap\operatorname*{supp}(\widehat{B}_{{\bm{\delta}}})$ .

There is a natural correspondence between the density of the matrix $\mathsf{A}$ and the set of translations ${\bm{\delta}}$ such that $\widehat{\omega}_{\bm{\delta}}\neq\emptyset$ . Consequently, the cardinality of the set of relevant translations, $\mathscr{D}=\{{\bm{\delta}}=\widetilde{\bm{x}}_{j}-\widetilde{\bm{x}}_{i}\,:\,\widehat{\omega}_{\bm{\delta}}\neq\emptyset\}$ , is fixed for all sufficiently large $m$ . Namely, $|\mathscr{D}|=(2p+1)^{n}$ . Now, for each ${\bm{\delta}}\in\mathscr{D}$ , we may define the stencil function

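$$\Phi_{\bm{\delta}}(\widetilde{\bm{x}})=\int_{\widehat{\omega}_{\bm{\delta}}}\widehat{\nabla}\widehat{B}_{\bm{\delta}}(\widehat{\bm{y}})^{\top}K(\widetilde{\bm{x}}+\widehat{\bm{y}})\,\widehat{\nabla}\widehat{B}(\widehat{\bm{y}})\,\mathrm{d}\widehat{\bm{y}}.\tag{10}$$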

Let $\mathrm{conv}(\widetilde{\mathbb{X}})$ denote the convex hull of $\widetilde{\mathbb{X}}$ . Such functions are defined at any point $\widetilde{\bm{x}}\in\mathrm{conv}(\widetilde{\mathbb{X}})$ where $\widetilde{\bm{x}}+{\bm{\delta}}\in\mathrm{conv}(\widetilde{\mathbb{X}})$ . Ultimately, this means that the domain of $\Phi_{\bm{\delta}}$ , which we will denote $\widetilde{\Omega}_{\bm{\delta}}$ , depends on ${\bm{\delta}}$ (see, e.g., Figure 4). Clearly, we always have $\bm{0}\in\mathscr{D}$ . The reader may easily verify that $\widetilde{\Omega}_{\bm{0}}=\mathrm{conv}(\widetilde{\mathbb{X}})=\bigcup_{{\bm{\delta}}\in\mathscr{D}}{\widetilde{\Omega}_{\bm{\delta}}}$ and $\widetilde{\Omega}_{\bm{\delta}}+{\bm{\delta}}\operatorname{\subseteq}\widetilde{\Omega}_{\bm{0}}$ , for each ${\bm{\delta}}\in\mathscr{D}$ .
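The counting and inclusion statements above are easy to check by brute force. The sketch below assumes that the relevant translations are exactly those with components in $\{-ph,\ldots,ph\}$ and that $\mathrm{conv}(\widetilde{\mathbb{X}})$ is the box whose sides run from the first to the last cardinal center; the helper names and the values $p=2$, $m=12$, $n=2$ are hypothetical.

```python
from fractions import Fraction
from itertools import product


def translations(p, h, n):
    """All offsets delta whose components lie in {-p*h, ..., p*h}."""
    steps = [j * h for j in range(-p, p + 1)]
    return list(product(steps, repeat=n))


def stencil_domain(delta, p, h):
    """Per-direction intervals of points x in conv(X) with x + delta in conv(X)."""
    a = Fraction(p + 1, 2) * h  # first cardinal center
    b = 1 - a                   # last cardinal center
    return [(max(a, a - d), min(b, b - d)) for d in delta]


def contained(box, outer):
    """True if the box lies inside the outer box, direction by direction."""
    return all(lo_o <= lo and hi <= hi_o for (lo, hi), (lo_o, hi_o) in zip(box, outer))


p, m, n = 2, 12, 2
h = Fraction(1, m - p)
D = translations(p, h, n)
omega_0 = stencil_domain((0,) * n, p, h)

# Every domain lies inside the domain for delta = 0 ...
all_inside = all(contained(stencil_domain(d, p, h), omega_0) for d in D)
# ... and so does every domain after shifting it by its own delta.
shifted_inside = all(
    contained([(lo + dk, hi + dk) for (lo, hi), dk in zip(stencil_domain(d, p, h), d)], omega_0)
    for d in D
)
```

Exact rational arithmetic via `Fraction` is used so that the interval comparisons are not affected by round-off.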

3.3 NURBS basis functions

The principal difference between the treatment of a NURBS basis $\{\widehat{N}_{i}\}$ and the related B-spline basis $\{\widehat{B}_{i}\}$ is that a NURBS basis cannot be assumed to have the translation invariance property which leads directly to 5. Fortunately, as we now demonstrate, this property is not entirely necessary to define a useful stencil function.

Define $W(\widehat{\bm{x}})=\sum_{j}w_{j}\widehat{B}_{j}(\widehat{\bm{x}})$ , where $\{w_{j}\}$ are the weight parameters appearing in 2. It is known that $W(\widehat{\bm{x}})$ is unchanged under mesh refinements. Therefore, employing a similar change of variables argument as used in 9, it holds that

[TABLE]

where, as in 9, ${\bm{\delta}}=\widetilde{\bm{x}}_{j}-\widetilde{\bm{x}}_{i}$ . At this point, it may be natural to divide by $w_{i}w_{j}$ and define the stencil function $\Phi_{\bm{\delta}}(\widetilde{\bm{x}})$ from the resulting expression on the right-hand side of 11. Instead, we pause to consider the regularity of $W(\widehat{\bm{x}})$ .

Recall 7. Since each $\widehat{B}_{i}(\widehat{\bm{x}})$ is only piecewise polynomial, it is clear that, in general, $W\not\in C^{q}(\widehat{\Omega})$ , for any $q\geq p$ . This fact could significantly limit the accuracy of an interpolant $\widetilde{\Phi}_{\bm{\delta}}=\Pi\Phi_{\bm{\delta}}$ which we may wish to construct. Therefore, we restrict our attention to $W\in C^{q}(\widehat{\Omega})$ , where $q\geq p$ . It turns out that this set of functions, $\mathrm{span}\{\widehat{B}_{i}\}\cap C^{q}(\widehat{\Omega})$ , is equal to the polynomial space $\mathcal{Q}_{p}(\widehat{\Omega})$ . Moreover, restricting to the subset $\widetilde{\Omega}_{\bm{0}}$ , each weight parameter can be expressed $w_{i}=w(\widetilde{\bm{x}}_{i})$ , where $w(\widehat{\bm{x}})$ is a polynomial in $\mathcal{Q}_{p}(\widetilde{\Omega}_{\bm{0}})$ . (See Appendix A for details.) Therefore, for any $W\in\mathcal{Q}_{p}(\widehat{\Omega})$ , we may define

[TABLE]

for each $\widetilde{\bm{x}}\in\widetilde{\Omega}_{\bm{\delta}}$ and ${\bm{\delta}}\in\mathscr{D}$ . Clearly, $\mathsf{A}_{ij}=\Phi_{\bm{\delta}}(\widetilde{\bm{x}}_{i})$ for each corresponding ${\bm{\delta}}\in\mathscr{D}$ . An illustration of a stencil function coming from an IGA discretization with a NURBS basis is presented in Figure 5.

Remark 3.2.

*Notably, when $W=1$ , then $w=1$ also. Therefore, 12 is consistent with 10. Obviously, in practice, neither of these expressions needs to be used in order to evaluate $\Phi_{\bm{\delta}}(\cdot)$ at any point $\widetilde{\bm{x}}_{i}$ . Indeed, since $\Phi_{\bm{\delta}}(\widetilde{\bm{x}}_{i})=\mathsf{A}_{ij}$ , any existing IGA code already has a mechanism to compute $\Phi_{\bm{\delta}}(\widetilde{\bm{x}}_{i})$ using quadrature (cf. Section 6). Nevertheless, these expressions are important for analysis. *

3.4 Symmetric bilinear forms

When $a(w,v)=a(v,w)$ , for all $w,v\in V$ , a translational symmetry is induced on the set of corresponding stencil functions. Indeed, it is simple to see that $\Phi(\widetilde{\bm{x}}_{j},\widetilde{\bm{x}}_{i})=\Phi(\widetilde{\bm{x}}_{i},\widetilde{\bm{x}}_{j})$ and, therefore,

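$$\Phi_{\bm{\delta}}(\widetilde{\bm{x}}_{i})=\Phi_{-{\bm{\delta}}}(\widetilde{\bm{x}}_{i}+{\bm{\delta}}).\tag{13}$$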

A similar conclusion can be drawn in the NURBS scenario above. Figure 6 presents a visual comparison of two stencil functions, $\Phi_{\bm{\delta}}$ and $\Phi_{-{\bm{\delta}}}$ , generated by an isogeometric NURBS basis and the corresponding symmetric bilinear form 8.

3.5 The multi-patch setting

In the multi-patch setting, the physical domain $\Omega$ is partitioned into a finite number of disjoint subdomains $\overline{\Omega}=\bigcup_{k=1}^{L}\overline{\Omega^{(k)}}$ . We may assume that each such domain or patch $\Omega^{(k)}$ , as they are usually called, can be identified with a common parametric domain $\widehat{\Omega}$ , through a unique NURBS mapping $\Omega^{(k)}=\bm{\varphi}^{(k)}(\widehat{\Omega})$ . In this case, one may define a separate set of stencil functions $\Phi_{\bm{\delta}}^{(k)}(\widetilde{\bm{x}}):\widetilde{\Omega}_{\bm{\delta}}\to\mathbb{R}$ , for each index $k$ . The forthcoming analysis is immediately applicable to this setting, but treating it outright would require unnecessarily complicated notation. We also did not consider this setting in our numerical experiments.

3.6 Edge-based and face-based stencil functions

It is also possible to define stencil functions, in addition to those above, which exploit lower-dimensional translational symmetries. For instance, many functions in the basis $\{\widehat{N}_{i}\}$ are only equivalent under translations parallel to a given edge in the parametric domain $\widehat{\Omega}$ . Likewise, a new family of stencil functions could be constructed for each edge or face in a patch $\Omega^{(k)}$ . We do not pursue this observation here.

4 Surrogate matrices: Interpolation of stencil functions

Recall 6 and 7. A surrogate matrix $\widetilde{\mathsf{A}}\approx\mathsf{A}$ will be useful to us if: (1) each stencil function $\Phi_{\bm{\delta}}$ has high regularity and, therefore, the high-order approximation $\Pi{\Phi}_{\bm{\delta}}\approx\Phi_{\bm{\delta}}$ will have high-order accuracy; and (2) each $\widetilde{\Phi}_{\bm{\delta}}=\Pi{\Phi}_{\bm{\delta}}$ can be computed and evaluated fast. In this section, we focus on the former of these two requirements. Particulars on our implementation are withheld until Section 6.

As a simplifying accommodation, we perform all of our analysis on the largest subset of $\widehat{\Omega}$ where every stencil function is defined. That is, $\widetilde{\Omega}=\bigcap_{{\bm{\delta}}\in\mathscr{D}}{\widetilde{\Omega}_{\bm{\delta}}}$ . A simple computation shows that

$$\widetilde{\Omega}=\Big[\tfrac{3p+1}{2}h,\,1-\tfrac{3p+1}{2}h\Big]^{n}.\tag{14}$$

All of the results in this section can be reformulated with $\widetilde{\Omega}$ replaced by $\widetilde{\Omega}_{\bm{\delta}}$ . However, this generalization also requires a different operator $\Pi$ for each ${\bm{\delta}}\in\mathscr{D}$ ; see, e.g., [21].

4.1 B-spline interpolation

Let $\{\widetilde{B}_{j}\}$ be a degree $q\geq 0$ multivariate B-spline basis on $\widetilde{\Omega}$ with the quasi-uniform knot vector $\widetilde{\bm{\Xi}}=\widetilde{\Xi}_{1}\times\cdots\times\widetilde{\Xi}_{n}$ and define $S_{q}(\widetilde{\bm{\Xi}})=\mathrm{span}\{\widetilde{B}_{j}\}$ . We will refer to each $\widetilde{\bm{\xi}}_{\bm{j}}\in\widetilde{\bm{\Xi}}$ , $\bm{j}=(j_{1},\cdots,j_{n})$ , as a sampling point and define a new length scale parameter, hereby referred to as the sampling length, $H=\operatorname*{\vphantom{p}max}_{|\bm{j}|=1,\bm{i}}\big{\{}\|\widetilde{\bm{\xi}}_{\bm{i}+\bm{j}}-\widetilde{\bm{\xi}}_{\bm{i}}\|_{\mathrm{max}}\,:\,\widetilde{\bm{\xi}}_{\bm{i}+\bm{j}}\in\widetilde{\bm{\Xi}}\big{\}}$ .

In isogeometric analysis, the geometry determines the basis used in PDE discretization. When constructing the surrogate stencil functions $\widetilde{\Phi}_{\bm{\delta}}$ , we are primarily interested in using a basis of higher order $q$ than the underlying spatial discretization $p$ . Generally, such a choice $q>p$ is desirable because it will allow us to guarantee that the discretization error in a standard IGA method will dominate the error actually attributed to using a surrogate (cf. Section 7.3.2).

In this paper, we only consider constructing B-spline interpolants $\widetilde{\Phi}_{\bm{\delta}}=\Pi_{H}\Phi_{\bm{\delta}}$ , where $\Pi_{H}$ is a stable local interpolation operator onto the space $S_{q}(\widetilde{\bm{\Xi}})$ . Various global interpolants could also be considered, as well as sparse grid interpolants [9] and least-squares projections (cf. [21]). We see no benefit in using a NURBS basis to approximate $\Phi_{{\bm{\delta}}}$ , even when a NURBS mapping $\bm{\varphi}:\widehat{\Omega}\to\Omega$ defines the physical domain. The following lemma follows directly from [14, Theorem 4.2].

Lemma 4.1.

For every bounded projection $\Pi_{H}\colon C^{0}({\widetilde{\Omega}})\to S_{q}(\widetilde{\bm{\Xi}})$ , with $\|\Pi_{H}\|<C_{0}$ for some $H$ -independent constant $C_{0}$ , it holds that

[TABLE]

*where $C_{1}$ is a constant depending only on $q$ , $\widetilde{\Omega}$ , and $\|\Pi_{H}\|$ . *

Remark 4.2.

*The norm $\|\Pi_{H}\|$ can be greatly influenced by the distribution of the sampling points $\widetilde{\bm{\xi}}_{\bm{j}}\in\widetilde{\bm{\Xi}}$ . In all of our experiments, we kept $\widetilde{\bm{\Xi}}\operatorname{\subseteq}\widetilde{\Omega}\cap\widetilde{\mathbb{X}}$ . This convenient choice delivered good results. When we wish to underscore the convention $\widetilde{\bm{\Xi}}\operatorname{\subseteq}\widetilde{\Omega}\cap\widetilde{\mathbb{X}}$ , we will denote the sampling points in $\widetilde{\bm{\Xi}}$ by $\widetilde{\bm{x}}_{i}^{\mathrm{s}}$ . Moreover, from now on, dependence on the subset $\widetilde{\Omega}$ will not be stated since $\widetilde{\Omega}\to\widehat{\Omega}$ as $h\to 0$ and the parametric domain $\widehat{\Omega}=(0,1)^{n}$ is always fixed. *

4.2 Regularity of the stencil functions

Define $\widetilde{\Phi}_{\bm{\delta}}=\Pi_{H}\Phi_{\bm{\delta}}$ , where $\Pi_{H}$ is any projection satisfying the assumptions of Lemma 4.1. In this subsection, we present an essential theorem on the error in the class of surrogate stencil functions defined in 12. In order to expedite our presentation, we only prove Theorem 4.3 here under an assumption which directly relates to the B-spline basis scenario 10. The general proof is given in Appendix A.

Theorem 4.3.

Let $\Phi_{\bm{\delta}}:\widetilde{\Omega}\to\mathbb{R}$ be defined by 12 and assume that $\bm{\varphi}:\widehat{\Omega}\to\Omega$ induces a coefficient tensor ${K}\in\big{[}W^{q+1,\infty}(\widehat{\Omega})\big{]}^{n\times n}$ . If $W\in\mathcal{Q}_{p}(\widehat{\Omega})$ , then there exists a constant $C_{2}$ , depending only on $p$ , $q$ , $\|\Pi_{H}\|$ , and $\bm{\varphi}$ , such that

[TABLE]

*Moreover, if $W(\widehat{\bm{x}})=1$ , then $C_{2}\leq C\,C_{1}\,[K]_{W^{q+1,\infty}(\widehat{\Omega})}$ , for some $C$ depending only on $p$ . *

Proof 4.4 (Proof of Theorem 4.3, under the assumption $W(\widehat{\bm{x}})=1$ ).

For every multi-index ${\bm{\alpha}}$ , it holds that

[TABLE]

*Therefore, $\|D^{{\bm{\alpha}}}\Phi_{\bm{\delta}}\|_{L^{\infty}(\widetilde{\Omega})}\leq\|D^{{\bm{\alpha}}}{K}\|_{L^{\infty}(\widehat{\Omega})}\|\widehat{\nabla}\widehat{B}\cdot\widehat{\nabla}\widehat{B}_{\bm{\delta}}\|_{L^{1}(\widehat{\omega}_{\bm{\delta}})}\leq Ch^{n-2}\|D^{{\bm{\alpha}}}{K}\|_{L^{\infty}(\widehat{\Omega})}$ . *

Remark 4.5.

*Observe that $[K]_{W^{q+1,\infty}(\widehat{\Omega})}=0$ iff $K\in[\mathcal{Q}_{q}(\widehat{\Omega})]^{n\times n}$ . Generally, due to the definition of $K$ appearing in 8, this assumption can only be expected to be satisfied when the geometry map $\bm{\varphi}$ is affine. Nevertheless, if $W=1$ as well, then Theorem 4.3 would imply that $\widetilde{\Phi}_{\bm{\delta}}=\Phi_{\bm{\delta}}\in\mathcal{Q}_{q}(\widetilde{\Omega})$ . This is a useful reproduction property which can help to verify the implementation. It also manifests in stencil functions defined from more complicated bilinear forms than 8. The reproduction of stencil functions is described in a general scenario in Section 5.4. *

5 Surrogate matrices: Preserving structure

In this section, we discuss a number of different surrogate matrix definitions. We also show that, under suitable assumptions, the surrogate matrices may actually fully reproduce the very matrices they approximate. Note that these definitions may also be employed for general matrices and not only for a stiffness matrix. If not stated otherwise, in the following subsections we assume that the matrix $\mathsf{A}$ emanates from a discretization of a general bilinear form.

5.1 General surrogate matrices

In Section 3.1, the definition $\widetilde{\mathsf{A}}_{ij}=\widetilde{\Phi}_{\bm{\delta}}(\widetilde{\bm{x}}_{i})$ was presented, with ${\bm{\delta}}=\widetilde{\bm{x}}_{j}-\widetilde{\bm{x}}_{i}$ . However, this is valid exclusively for the special indices $1\leq i,j\leq N$ with a cardinal B-spline associated to them. Of course, we may always compute the remaining components of the surrogate matrix $\widetilde{\mathsf{A}}$ directly, using element-wise quadrature, as done in standard IGA assembly algorithms. Therefore, we propose the following definition:

[TABLE]

5.2 Symmetry

If the bilinear form is symmetric, 18a does not guarantee that the corresponding surrogate stiffness matrix $\widetilde{\mathsf{A}}$ will be symmetric. In order to enforce symmetry, it is convenient to just include the action of copying $\mathsf{A}_{ij}$ into $\mathsf{A}_{ji}$ , for all $i>j$ . Therefore, we propose the following symmetric surrogate matrix definition:

[TABLE]

With this definition, note that only $\frac{(2p+1)^{n}+1}{2}$ surrogate stencil functions need to be computed. We employ this definition in the construction of surrogate mass matrices $\mathsf{M}$ which are symmetric but do not have a kernel (cf. Section 8).
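As a short check of this count: the translation ${\bm{\delta}}=\bm{0}$ is its own mirror image, while the remaining $(2p+1)^{n}-1$ translations in $\mathscr{D}$ split into pairs $\{{\bm{\delta}},-{\bm{\delta}}\}$ , and symmetry allows one member of each pair to be obtained from the other (cf. Section 3.4). Hence,

$$1+\frac{(2p+1)^{n}-1}{2}=\frac{(2p+1)^{n}+1}{2}.$$

For instance, with $p=2$ this gives $13$ out of $25$ stencil functions when $n=2$ and $63$ out of $125$ when $n=3$ .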

5.3 Preserving the kernel

Recall that $\sum N_{i}(\bm{x})=1$ and, therefore, $1\in V_{h}=\mathrm{span}\{N_{i}\}$ . In the situation $a(u,v)=\int_{\Omega}\nabla u\cdot\nabla v\,\mathrm{d}x$ , we have that $a(1,w)=a(w,1)=0$ , for all $w\in H^{1}(\Omega)$ . Consequently, $\mathsf{A}\mathsf{v}_{1}=0$ , where $\mathsf{v}_{1}=(1,\ldots,1)^{\top}$ . Clearly, neither definition 18a nor definition 18b will guarantee that $\widetilde{\mathsf{A}}\mathsf{v}_{1}=0$ . Therefore, we pose the following symmetric kernel-preserving definition:

[TABLE]

With this definition, the reader may readily verify that $\widetilde{\mathsf{A}}\mathsf{v}_{1}=0$ and $\widetilde{\mathsf{A}}=\widetilde{\mathsf{A}}^{\top}$ .
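The two structural repairs discussed in Sections 5.2 and 5.3 can be sketched on a small dense matrix. This is only an illustration based on the verbal description (copying the entries with $i>j$ into their mirror positions, then resetting each diagonal entry so that the row sums vanish); it is not a transcription of definitions 18b and 18c, and the function name and the sample matrix are hypothetical.

```python
def symmetrize_and_fix_row_sums(A):
    """Return a symmetric copy of the square matrix A whose rows sum to zero."""
    n = len(A)
    S = [row[:] for row in A]
    for i in range(n):
        for j in range(i):
            S[j][i] = S[i][j]  # copy the entry with i > j into its mirror position
    for i in range(n):
        # Row sum trick: the diagonal absorbs the off-diagonal entries of its row.
        S[i][i] = -sum(S[i][j] for j in range(n) if j != i)
    return S


# A slightly non-symmetric matrix standing in for an interpolated surrogate.
A_tilde = [
    [1.00, -1.02, 0.00],
    [-0.98, 2.00, -1.01],
    [0.00, -0.99, 1.00],
]
S = symmetrize_and_fix_row_sums(A_tilde)
```

Because the symmetrization only touches off-diagonal entries and the row sum step only touches the diagonal, the result is symmetric and annihilates the constant vector $\mathsf{v}_{1}$ .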

Remark 5.1.

*Two important comments are in order. First, in a matrix-free setting, where memory copying cannot be performed efficiently, one can actually design a set of surrogate stencil functions $\widetilde{\Phi}_{\bm{\delta}}$ which preserves the symmetry of the stiffness matrix (see, e.g., [21, Remark 3.4]). Second, in general, it is difficult to generalize the row sum trick in 18c, in a way which preserves symmetry, when the bilinear form $a(\cdot,\cdot)$ has a multi-dimensional kernel (cf. Section 9). *

5.4 Polynomial reproduction

Until now, we have focused, almost entirely, on the analysis of surrogate stiffness matrices which derive from the bilinear form generated by Poisson’s equation. Clearly, the methodology presented above can be applied to other settings as well. The general scenario we are interested in is when $a(\cdot,\cdot)$ in 1a can be expressed in the parametric domain as

[TABLE]

We now consider the general coefficient matrix $\mathsf{A}$ and a cardinal B-spline basis $\{\widehat{B}_{i}\}$ (cf. Section 3.2). Invoking 19b, a simple change of variables leads us to

[TABLE]

where, as before, ${\bm{\delta}}=\widetilde{\bm{x}}_{j}-\widetilde{\bm{x}}_{i}$ and $\widehat{\omega}_{\bm{\delta}}=\operatorname*{supp}(\widehat{B})\cap\operatorname*{supp}(\widehat{B}_{{\bm{\delta}}})$ . Using the techniques put forth in Section 3.3, this expression can easily be generalized for cardinal NURBS bases with polynomial weight functions $W(\widehat{\bm{x}})$ . However, considering only the case of a cardinal B-spline basis, if $G(\cdot,\cdot,\cdot)$ is a $\mathcal{Q}_{p}(\widehat{\Omega})$ polynomial in its first argument, we have the following reproduction property (cf. Remark 4.5).

Proposition 5.2.

Assume that 19a and 19b hold. For all $\widehat{\bm{x}}\in\widetilde{\Omega}$ , define

[TABLE]

*If $G(\cdot,\widehat{\bm{y}},\widehat{\bm{y}})\in\mathcal{Q}_{p}(\widehat{\Omega})$ , for every $\widehat{\bm{y}}\in\widehat{\Omega}$ , then $\Phi_{\bm{\delta}}\in\mathcal{Q}_{q}(\widetilde{\Omega})$ . Moreover, taking 18a as the definition of the surrogate $\widetilde{\mathsf{A}}$ , it holds that $\widetilde{\mathsf{A}}=\mathsf{A}$ . *

Proof 5.3.

It suffices to show that $\widetilde{\Phi}_{\bm{\delta}}=\Phi_{\bm{\delta}}$ , for all ${\bm{\delta}}\in\mathscr{D}$ . Let ${\bm{\alpha}}=(\alpha_{1},\ldots,\alpha_{n})$ be a multi-index, $\widetilde{\bm{x}}^{\bm{\alpha}}=\widetilde{x}_{1}^{\alpha_{1}}\cdots\widetilde{x}_{n}^{\alpha_{n}}$ . By assumption, we may express $G(\widetilde{\bm{x}},\widehat{B}_{\bm{\delta}}(\widehat{\bm{y}}),\widehat{B}(\widehat{\bm{y}}))=\sum_{i=1}^{n}\sum_{\alpha_{i}\leq p}c_{\bm{\alpha}}(\widehat{\bm{y}})\widetilde{\bm{x}}^{\bm{\alpha}}$ , where each coefficient function $c_{\bm{\alpha}}(\widehat{\bm{y}})$ has support only in $\widehat{\omega}_{\bm{\delta}}$ . Moreover, if $\widetilde{\bm{x}}+\widehat{\bm{y}}\in\widehat{\Omega}$ , then

[TABLE]

*is clearly an equal degree polynomial in the $\widetilde{\bm{x}}$ -variable. The proof is completed by noting that the integral in 21 is performed in the $\widehat{\bm{y}}$ -variable over the set $\widehat{\omega}_{{\bm{\delta}}}\subseteq\widehat{\Omega}$ , for every ${\bm{\delta}}\in\mathscr{D}$ . *

6 Surrogate matrices: Faster assembly with existing software

All the examples in this paper were implemented using the GeoPDEs package for Isogeometric Analysis in MATLAB and Octave [20, 42]. This package provides a framework for implementing and testing new isogeometric methods for the solution of partial differential equations. We reused most of the original low-level functions and only had to make changes to some high-level assembly functions. A detailed explanation of the code modifications and extensions is provided in [22]. Additionally, a reference implementation with example code is available in the git repository [1]. Nonetheless, we give here a short explanation of the implementation in GeoPDEs. In particular, for the Poisson problem, we modified op_gradu_gradv_tp using the following strategy:

Starting from the first point $\widetilde{\bm{x}}_{i}\in{\widetilde{\Omega}}\cap\widetilde{\mathbb{X}}$ , let $\widetilde{\bm{\Xi}}$ be the lattice containing every $M{}^{\text{th}}$ point $\widetilde{\bm{x}}_{i}\in{\widetilde{\Omega}}\cap\widetilde{\mathbb{X}}$ , in each Cartesian direction. When $M$ does not evenly divide these points in any given Cartesian direction, include the $M{}^{\text{th}}$ endpoints $\widetilde{\bm{x}}_{i}\in{\partial\widetilde{\Omega}}\cap\widetilde{\mathbb{X}}$ as well; see, e.g., the black dots in Figure 7. The stencil functions $\Phi_{\bm{\delta}}(\widetilde{\bm{x}})$ are evaluated at all points $\widetilde{\bm{x}}_{i}^{\mathrm{s}}\in\widetilde{\bm{\Xi}}$ and these values are used as the support points of the ensuing B-spline interpolant $\widetilde{\Phi}_{\bm{\delta}}(\widetilde{\bm{x}})$ .

An additional benefit of this choice is that $\widetilde{\Phi}_{\bm{\delta}}(\widetilde{\bm{x}}_{i}^{\mathrm{s}})=\Phi_{\bm{\delta}}(\widetilde{\bm{x}}_{i}^{\mathrm{s}})$ , at each point $\widetilde{\bm{x}}_{i}^{\mathrm{s}}\in\widetilde{\bm{\Xi}}$ . This leads to an increased point-wise accuracy and lower potential cost, since each entry $\widetilde{\mathsf{A}}_{ij}=\widetilde{\Phi}_{\bm{\delta}}(\widetilde{\bm{x}}_{i}^{\mathrm{s}})$ in the surrogate stiffness matrix is equal to the correct entry, $\mathsf{A}_{ij}=\Phi_{\bm{\delta}}(\widetilde{\bm{x}}_{i}^{\mathrm{s}})$ . Here, it is appropriate to point out that when $M=1$ every point $\widetilde{\bm{x}}_{i}$ is sampled, i.e., $\widetilde{\bm{\Xi}}=\widetilde{\Omega}\cap\widetilde{\mathbb{X}}$ . In this case, $H=h$ and there is no difference between the surrogate $\widetilde{\mathsf{A}}$ and the true $\mathsf{A}$ .
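The lattice construction described above can be sketched in a few lines. The following is a minimal illustration, not the GeoPDEs implementation: the function names `sample_indices` and `sampling_lattice` are hypothetical, and points are identified by their integer index in each Cartesian direction.

```python
from itertools import product


def sample_indices(num_points, M):
    """Indices of every M-th point in one Cartesian direction.

    The last point is appended when M does not evenly divide the points,
    so that the sampling lattice always reaches the boundary.
    """
    if num_points < 1 or M < 1:
        raise ValueError("num_points and M must be positive")
    idx = list(range(0, num_points, M))
    if idx[-1] != num_points - 1:
        idx.append(num_points - 1)
    return idx


def sampling_lattice(shape, M):
    """Tensor-product lattice of sampled multi-indices (hypothetical helper)."""
    return list(product(*(sample_indices(n, M) for n in shape)))


# Example: 11 x 12 points with M = 5 (sizes chosen only for illustration).
lattice = sampling_lattice((11, 12), 5)
print(len(lattice))  # 3 samples in the first direction times 4 in the second
```

With $M=1$ the helper returns every index, which matches the case $H=h$ discussed above.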

In order to evaluate $\Phi_{\bm{\delta}}(\widetilde{\bm{x}}_{i}^{\mathrm{s}})$ , we identify the matrix rows which correspond to the sampling points $\widetilde{\bm{x}}_{i}^{\mathrm{s}}\in\widetilde{\bm{\Xi}}$ . Additionally, we include the rows which correspond to basis functions near the domain boundary. Each of these rows is to be assembled using quadrature formulas; see the red and green points in Fig. 8. The number of interior rows depends on $M$ , whereas the number of rows corresponding to the boundary depends on the order of the basis functions $p$ . After that, we identify all of the active elements which need to be assembled to compute the estimated rows, cf. Fig. 9. Note that the number of required elements for each sample point depends on the order $p$ . In order to assemble these elements, we employ the op_gradu_gradv function, but skip the elements which are not active.
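To make the dependence on $p$ concrete, the identification of active elements can be sketched in one dimension. This is a simplified illustration under stated assumptions, not the code used in the paper: knots are uniform and non-repeated, basis function $i$ is supported on elements $i,\ldots,i+p$, and the helper name `active_elements` is hypothetical.

```python
def active_elements(rows, p, num_elements):
    """Elements that must be integrated to assemble the given matrix rows.

    Assumption: 1D, uniform non-repeated knots, so that the i-th B-spline
    of degree p is supported on elements i, ..., i + p. Every entry of row i
    is an integral over a subset of that support.
    """
    active = set()
    for i in rows:
        active.update(e for e in range(i, i + p + 1) if 0 <= e < num_elements)
    return sorted(active)


# Example: rows sampled with M = 5 on a mesh of 13 elements, degree p = 2.
elements = active_elements([0, 5, 10], p=2, num_elements=13)
print(elements)
print(len(elements) / 13)  # fraction of elements that still need quadrature
```

In a tensor-product setting the same reasoning applies direction by direction, which is why the number of active elements per sample point grows with $p$.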

To construct the interpolated stencil functions $\widetilde{\Phi}_{\bm{\delta}}$ , it is possible to use the builtin MATLAB functions interp2 and interp3. However, in 2D, we use the RectBivariateSpline function provided by the SciPy Python package [31], which supports spline interpolation up to order $5$ . These interpolated stencil functions are then evaluated in order to retrieve the remaining values of $\widetilde{\mathsf{A}}$ ; cf. the blue off-diagonal entries in Fig. 8. The assembly functions for the mass matrix, the biharmonic equation, and the Stokes problem were modified in a similar way. For symmetric matrices, only the upper-diagonal entries are interpolated and copied to the lower-diagonal entries (cf. 18b). In the Poisson and biharmonic case, we additionally enforce the zero-row sum property by changing the diagonal entries for all rows which include at least one interpolated value (cf. 18c).
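The last two post-processing steps (copying interpolated upper-diagonal entries to the lower-diagonal part, then restoring the zero row sum) can be illustrated on a small dense matrix. This is only a sketch under assumptions: the matrix is stored as a list of lists, `interpolated` is a hypothetical set of index pairs $(i,j)$ with $i<j$ marking interpolated entries, and no boundary conditions have been applied.

```python
def symmetrize_and_fix_row_sums(A, interpolated):
    """Return a copy of A with symmetry and zero row sums restored.

    A            -- square matrix as a list of lists
    interpolated -- set of (i, j), i < j, whose values came from interpolation
    """
    n = len(A)
    A = [row[:] for row in A]  # work on a copy
    touched = set()
    for i, j in interpolated:
        A[j][i] = A[i][j]      # copy upper-diagonal value to the lower part
        touched.update((i, j))
    for i in touched:
        # Only rows containing an interpolated value get a new diagonal.
        A[i][i] = -sum(A[i][j] for j in range(n) if j != i)
    return A


# Entry (0, 1) is an interpolated value; entry (1, 0) has not been set yet.
A = [[2.0, -1.1, -1.0],
     [0.0, 2.0, -1.0],
     [-1.0, -1.0, 2.0]]
B = symmetrize_and_fix_row_sums(A, {(0, 1)})
print(B)
```

Rows without interpolated values, such as the last row here, keep the diagonal entry obtained from quadrature.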

In order to show that the surrogate approach may be easily applied to other IGA frameworks, we tried to keep the modifications as simple as possible. However, in an IGA implementation tailored to the surrogate approach, even more properties may be exploited to achieve better performance. For example, in the current implementation, the complete local stiffness matrices of the active elements are computed via quadrature, but in practice only a single row of the local matrix is required. Exploiting this fact would save a significant amount of unnecessary computation, especially as $p$ grows, but such an implementation in GeoPDEs would also involve the modification of low-level functions.

Remark 6.1.

7 Poisson’s equation

In this section, we analyze a surrogate IGA discretization of Poisson’s equation on the domain $\Omega=\bm{\varphi}(\widehat{\Omega})$ . Here, as well as in the forthcoming problems, we restrict our attention to Dirichlet boundary conditions. This simplifies the analysis while also retaining all of its interesting features. Given a function $f\in L^{2}(\Omega)$ , the corresponding weak form is the following:

Find $u\in H^{1}_{0}(\Omega)$ satisfying $a(u,v)=F(v)$ for all $v\in H^{1}_{0}(\Omega)$ , (24)

where $a(u,v)=\int_{\Omega}\nabla u\cdot\nabla v\,\mathrm{d}x$ and $F(v)=\int_{\Omega}f\hskip 0.25ptv\,\mathrm{d}x$ . As discussed earlier, the bilinear form $a(\cdot,\cdot)$ can be rewritten on the parametric domain $\widehat{\Omega}$ , using the expression 8. Recalling this detail, we continue with the simplifying assumption $K\in\big{[}W^{q+1,\infty}(\widehat{\Omega})\big{]}^{n\times n}$ .

7.1 Inconsistency

Recall 1 and 4 and take $V_{h}=\mathrm{span}\{N_{i}\}$ , where every $N_{i}=\widehat{N}_{i}\circ\bm{\varphi}^{-1}$ . Analysis of surrogate methods best proceeds using the surrogate bilinear form $\widetilde{a}:V_{h}\times V_{h}\to\mathbb{R}$ inherent to the surrogate matrix $\widetilde{\mathsf{A}}$ . Explicitly,

[TABLE]

for all $w_{h}=\sum_{i}\mathsf{w}_{i}N_{i},v_{h}=\sum_{i}\mathsf{v}_{i}N_{i}\in V_{h}$ . Here and throughout, we shall use definition 18c in constructing the surrogate stiffness matrix $\widetilde{\mathsf{A}}$ and its associated surrogate bilinear form $\widetilde{a}(\cdot,\cdot)$ .

In the following analysis, Theorem 7.3 is of fundamental importance. Its proof is a simple consequence of Theorem 4.3 and Lemma 7.1. From now on, for any matrix $\mathsf{N}$ , we use the notation $|\mathsf{N}|_{\operatorname*{\vphantom{p}max}}=\operatorname*{\vphantom{p}max}_{i\neq j}|\mathsf{N}_{ij}|$ .

Lemma 7.1.

For all $v_{h},w_{h}\in V_{h}$ , the following upper bound holds:

[TABLE]

*where $C_{3}$ is a constant depending only on $\bm{\varphi}$ and $p$ . *

Proof 7.2.

Theorem 7.3.

Let $C_{4}=C_{2}\cdot C_{3}$ . For all $v_{h},w_{h}\in V_{h}$ , the following upper bound holds:

[TABLE]

Proof 7.4.

Obviously, we are done if $a(\cdot,\cdot)=\widetilde{a}(\cdot,\cdot)$ . Therefore, assume $a(\cdot,\cdot)\neq\widetilde{a}(\cdot,\cdot)$ and let $i<j$ be the indices of the maximal value $|\mathsf{A}_{ij}-\widetilde{\mathsf{A}}_{ij}|=\big{|}\mathsf{A}-\widetilde{\mathsf{A}}\big{|}_{\operatorname*{\vphantom{p}max}}>0$ . Since $\widetilde{\mathsf{A}}$ is defined by 18c, Theorem 4.3 leads us to the inequality

[TABLE]

*The proof is completed using Lemma 7.1. *

Remark 7.5.

In Lemma 7.1, it is important that $\widetilde{\mathsf{A}}$ be defined using 18c. Indeed, in 27, it is the symmetry and the zero row sum property preserved in this definition which allows $|a(v_{h},w_{h})-\widetilde{a}(v_{h},w_{h})|$ to be bounded by products of differences in the coefficients $\mathsf{v}_{i}$ and $\mathsf{w}_{j}$ . If we had used definition 18a or 18b, one would have to directly work with the upper bound

[TABLE]

This, in turn, can only be finessed to arrive at an inequality like

[TABLE]

*for some other constant $C_{3}^{\prime}$ , depending only on $\bm{\varphi}$ and $p$ . Notice the loss of an $h^{2}$ scaling factor when comparing 26 and 34. This difference could greatly affect solution accuracy. *

7.2 A priori error estimation

Define $V_{h,0}=V_{h}\cap H^{1}_{0}(\Omega)$ . The following lemma is identical in spirit to [21, Theorem 7.1]. By the Lax–Milgram theorem, this lemma allows us to conclude that there exists a unique surrogate solution corresponding to 24, namely $\widetilde{u}_{h}\in V_{h,0}$ , for every sufficiently small $H>0$ .

Lemma 7.6.

*Let $\alpha=(1+C_{P})^{-1}$ , where $C_{P}$ is the Poincaré constant for the domain $\Omega$ . If $H^{q+1}<{\alpha}\cdot C_{4}^{-1}$ , then the surrogate bilinear form $\widetilde{a}:V_{h}\times V_{h}\to\mathbb{R}$ is coercive on $V_{h,0}$ . Letting $\widetilde{\alpha}>0$ be the associated coercivity constant, it also holds that $\widetilde{\alpha}\to\alpha$ , as $H\to 0$ . *

Proof 7.7.

Let $S=\{v\in H^{1}_{0}(\Omega)\,:\,\|v\|_{1}=1\}$ be the surface of the unit ball in $H^{1}_{0}(\Omega)$ . Notice that $\alpha\leq a(v_{h},v_{h})\leq\widetilde{a}(v_{h},v_{h})+|a(v_{h},v_{h})-\widetilde{a}(v_{h},v_{h})|$ for all $v_{h}\in V_{h}\cap S$ and, therefore,

[TABLE]

Invoking Theorem 7.3, the second term on the left may be bounded from above as follows:

[TABLE]

*Clearly, if $C_{4}H^{q+1}<{\alpha}$ , then $0<\alpha-|a(v_{h},v_{h})-\widetilde{a}(v_{h},v_{h})|\leq\widetilde{a}(v_{h},v_{h})$ , as necessary. *

Theorem 7.8.

Let $\theta>1$ . If $u\in H^{p+1}(\Omega)$ , then there exists a constant $C_{5}$ , depending only on $p$ and $\bm{\varphi}$ , such that

[TABLE]

*for every sufficiently small $H>0$ . *

Proof 7.9.

We first prove 37a. Let $u_{h}\in V_{h,0}$ be the solution of the standard IGA discretization 1b associated to 24. Clearly, $\|u-\widetilde{u}_{h}\|_{1}\leq\|u-u_{h}\|_{1}+\|u_{h}-\widetilde{u}_{h}\|_{1}$ . Moreover, by [13, Theorem 6.1], $\|u-u_{h}\|_{1}\leq C_{5}h^{p}\|u\|_{p+1}$ . Recalling Lemma 7.6, we find that

[TABLE]

After invoking Theorem 7.3, it now readily follows that $\|u_{h}-\widetilde{u}_{h}\|_{1}\leq\widetilde{\alpha}^{-1}C_{4}H^{q+1}\|\nabla u\|_{0}$ . Since $\widetilde{\alpha}\to\alpha$ , in the limit $H\to 0$ , it also holds that $\|u_{h}-\widetilde{u}_{h}\|_{1}\leq\theta\cdot\alpha^{-1}C_{4}H^{q+1}\|\nabla u\|_{0}$ , for all sufficiently small $H>0$ .

Our proof of 37b also involves the triangle inequality: $\|u-\widetilde{u}_{h}\|_{0}\leq\|u-u_{h}\|_{0}+\|u_{h}-\widetilde{u}_{h}\|_{0}$ . If $\Omega$ is convex, then $\|u-u_{h}\|_{0}\leq C_{6}h^{p+1}\|u\|_{p+1}$ . Next, find $w_{h}\in V_{h,0}$ satisfying $a(w_{h},v_{h})=(u_{h}-\widetilde{u}_{h},v_{h})_{\Omega}$ , for all $v_{h}\in V_{h,0}$ . It holds that $\|\nabla w_{h}\|_{0}\leq C_{P}\|u_{h}-\widetilde{u}_{h}\|_{0}$ . Now, observe that $\|u_{h}-\widetilde{u}_{h}\|_{0}^{2}=a(w_{h},u_{h}-\widetilde{u}_{h})=\widetilde{a}(w_{h},\widetilde{u}_{h})-a(w_{h},\widetilde{u}_{h})$ . Finally, invoke Theorem 7.3 to arrive at

[TABLE]

*which works to deliver the stated estimate, since $\|\nabla\widetilde{u}_{h}\|_{0}\to\|\nabla u_{h}\|_{0}\leq\|\nabla u\|_{0}$ , as $H\to 0$ . *
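For readers who wish to check the identity $\|u_{h}-\widetilde{u}_{h}\|_{0}^{2}=\widetilde{a}(w_{h},\widetilde{u}_{h})-a(w_{h},\widetilde{u}_{h})$ used in the proof above, the intermediate steps can be written out as follows. This sketch assumes that the load $F$ is the same in the standard and surrogate discretizations, i.e., $a(u_{h},v_{h})=F(v_{h})=\widetilde{a}(\widetilde{u}_{h},v_{h})$ for all $v_{h}\in V_{h,0}$ , and uses the symmetry of both $a(\cdot,\cdot)$ and $\widetilde{a}(\cdot,\cdot)$ , the latter being preserved by definition 18c.

```latex
\begin{align*}
\|u_h-\widetilde{u}_h\|_0^2
  &= (u_h-\widetilde{u}_h,\,u_h-\widetilde{u}_h)_\Omega
   = a(w_h,\,u_h-\widetilde{u}_h)
   && \text{(definition of } w_h \text{ with } v_h = u_h-\widetilde{u}_h\text{)} \\
  &= a(u_h,w_h) - a(w_h,\widetilde{u}_h)
   && \text{(symmetry of } a\text{)} \\
  &= F(w_h) - a(w_h,\widetilde{u}_h)
   && \text{(standard discrete problem)} \\
  &= \widetilde{a}(\widetilde{u}_h,w_h) - a(w_h,\widetilde{u}_h)
   && \text{(surrogate discrete problem)} \\
  &= \widetilde{a}(w_h,\widetilde{u}_h) - a(w_h,\widetilde{u}_h)
   && \text{(symmetry of } \widetilde{a}\text{)}.
\end{align*}
```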

7.3 Numerical experiments

The two bounds 37a and 37b need to be experimentally verified. Because both estimates depend on the two scales $h$ and $H$ , in order to be overtly thorough, we should provide verification in both scales, independently. That is, first holding the $h$ -scale fixed and varying $H$ and, alternatively, holding the $H$ -scale fixed and varying $h$ . We forgo this verification step and instead refer the reader to similar studies with low-order finite elements in [6, 4, 5, 21]. In this section, therefore, we verify the bounds above under the assumption $H=H(h)$ . We expect that this would be the typical use case.

7.3.1 Overview

In our experiments with Poisson’s equation 24, we considered a $2$ D and a $3$ D domain $\Omega$ ; see Figure 10. The $2$ D domain is given by a second-order NURBS parameterization and the $3$ D domain is parameterized by second-order B-splines. This particular $2$ D domain (Figure 10 (a)) was chosen instead of the $2$ D quarter annulus featured later (Figure 16) in order to break some symmetries which would otherwise appear in its stencil functions (Figure 7). (Additional stencil function symmetries can appear if $\bm{\varphi}\colon\widehat{\Omega}\to\Omega$ is a conformal mapping.) We also note that, after fixing the NURBS parameterizations for the edges on the boundary of the $2$ D domain, the parameterization $\bm{\varphi}\colon\widehat{\Omega}\to\Omega$ was generated as a Coons patch [37]. Because each of the four edges had polynomial weight functions in their parameterizations, the resulting Coons patch NURBS parameterization, $\bm{\varphi}\colon\widehat{\Omega}\to\Omega$ , contained a global polynomial weight function $W(\widehat{\bm{x}})=\sum_{j}w_{j}\widehat{B}_{j}(\widehat{\bm{x}})$ . Consequently, the assumptions of Theorem 4.3 were satisfied in all of our experiments.

In constructing $\widetilde{\mathsf{A}}$ (cf. Section 6) and, thus, the surrogate stencil functions $\widetilde{\Phi}_{\bm{\delta}}=\Pi_{H}\Phi_{\bm{\delta}}$ , we tried local B-spline interpolants $\Pi_{H}\colon C^{0}({\widetilde{\Omega}})\to S_{q}(\widetilde{\bm{\Xi}})$ of various orders $1\leq q\leq 5$ . As stated previously (see Remark 4.2), we used the convention $\widetilde{\bm{\Xi}}\subseteq\widetilde{\Omega}\cap\widetilde{\mathbb{X}}$ . In our $2$ D experiments, we explored both $p=2$ and $p=3$ NURBS bases and the full range of possible $q$ . In our $3$ D experiments, we only explored the case $p=2$ using $q=1,3$ . This limitation of our $3$ D experiments was due to the restrictions of the MATLAB B-spline interpolation routine we were using.

In the first set of experiments (Figures 11, 12 and 13), the sampling length $H$ was set to $H=Mh$ , with the constant $M=5$ . In the second set of experiments (Figures 14 and 15), we used a mesh-dependent sampling length, defined below.

7.3.2 Constant sampling lengths

For the constant sampling strategy, with $M=5$ , convergence plots of the errors in surrogate solutions are presented in Figures 11, 12 and 13. After inspecting these figures, the reader should observe that there is a noticeable difference in the accuracy of the surrogate solutions $\widetilde{u}_{h}$ , dependent on the load $f$ . For instance, with the “low-frequency” manufactured solution, $u(\bm{x})=\sin\left(\pi\,x_{1}\right)\,\sin\left(\pi\,x_{2}\right)$ , the errors in the surrogate solutions appear to converge to the error generated by the (standard) non-surrogate IGA solution $u_{h}$ asymptotically, at various rates. However, for the “high-frequency” manufactured solution, $u(\bm{x})=\sin\left(20\,\pi\,x_{1}\right)\,\sin\left(20\,\pi\,x_{2}\right)$ , each of the surrogate errors and the corresponding standard IGA error are nearly indistinguishable (except, sometimes, in the case $q=1$ ).

This difference can be explained in a simple way. Observe that the $h$ -dependent terms in 37a and 37b are multiplied by the high-order norm $\|u\|_{p+1}$ and, meanwhile, the $H$ -dependent terms are only multiplied by the lower-order seminorm $\|\nabla u\|_{0}$ . Let us call the first term in each bound the discretization error and the second term the consistency error. Because of the presence of the high-order norm $\|u\|_{p+1}$ , the discretization error is much more sensitive to irregularities or oscillations in the solution. Another important detail not to overlook is that $C_{4}$ ultimately involves high-order norms of the tensor coefficient $K(\widehat{\bm{x}})$ . Meanwhile, $C_{5}$ and $C_{6}$ depend (through $\Omega$ ), at most, on the $H^{2}$ -norm of $K(\widehat{\bm{x}})$ .

These observations can lead in many interesting directions, but the conclusion which the reader should ultimately arrive at is that the assembly of the stiffness matrix should only be carried out to the accuracy required by the problem at hand. Therefore, the correct surrogate assembly strategy must take into account properties of the problem geometry and the given loads, as well as each of the parameters $h$ , $p$ , and $q$ . A simple way of doing this is to use mesh-dependent sampling lengths.

Remark 7.10.

*In [21], it was documented that the $L^{2}$ error converged like $h^{p+1}+H^{q+2}$ when $\Pi_{H}$ resembles an $L^{2}$ projection. However, when using an interpolation instead, as is done in this work, the error converges like $h^{p+1}+H^{q+2}$ whenever the interpolation order $q$ is even and like $h^{p+1}+H^{q+1}$ otherwise. This parity effect can also be observed in the results of this work; see, e.g., the top right plot in Fig. 12. We do not prove this improved convergence rate but refer to proofs of quadrature formulas where the same parity effect can be observed. *

7.3.3 Mesh-dependent sampling lengths

In a successful surrogate method, the discretization error should always dominate the consistency error. Of course, however, this domination should not be exercised to such an extent that overall efficiency is compromised. As a rule of thumb, keeping the consistency error at or below $5\%$ of the discretization error is often acceptable. Even within this somewhat conservative threshold, balancing the two errors appropriately can lead to very significant performance advantages.

Let $q>p$ and consider the mesh-dependent sampling length $M(h)=\operatorname*{\vphantom{p}max}\Big{\{}1,\big{\lfloor}c\cdot h^{\frac{p-q+\beta}{q+1}}\big{\rfloor}\Big{\}}$ , where both $c,\beta\geq 0$ are tuneable parameters. Returning to the definition $H=M\cdot h$ , we now find that, for any $c>0$ ,

[TABLE]

Here, for any $\beta>0$ , both the $H^{1}$ and $L^{2}$ discretization error will eventually dominate their associated consistency error. Therefore, this parameter exists to maintain the dominance of the discretization error, throughout mesh refinements. We found $\beta=\nicefrac{{1}}{{2}}$ to be a suitable choice for our purposes. The parameter $c$ , on the other hand, must be calibrated to the problem at hand.
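The formula for $M(h)$ is straightforward to evaluate. The sketch below is illustrative only: the function name `sampling_parameter` and the mesh size $h=1/1024$ are assumptions, while $p=2$ , $q=5$ , $\beta=1/2$ , and the two values of $c$ are those discussed in this section.

```python
import math


def sampling_parameter(h, p, q, c, beta=0.5):
    """M(h) = max{1, floor(c * h**((p - q + beta) / (q + 1)))}."""
    exponent = (p - q + beta) / (q + 1)
    return max(1, math.floor(c * h ** exponent))


p, q, h = 2, 5, 1.0 / 1024  # h is a hypothetical mesh size
for c in (0.75, 3.0):
    M = sampling_parameter(h, p, q, c)
    H = M * h
    # Whenever M > 1, H <= c * h**(1 + exponent), and therefore
    # H**(q + 1) <= c**(q + 1) * h**(p + 1 + beta).
    bound = c ** (q + 1) * h ** (p + 1 + 0.5)
    print(c, M, H ** (q + 1) <= bound)
```

Since the exponent is negative for $q>p+\beta$ , $M(h)$ grows under mesh refinement, so the fraction of rows assembled by quadrature shrinks as $h\to 0$ .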

All run-time measurements in this section were obtained on a machine equipped with two Intel® Xeon® Gold 6136 processors with a nominal base frequency of 3.0 GHz. Each processor has 12 physical cores, which results in a total of 24 physical cores, but only a single core was used to run the following experiments. The total available memory of $251\text{\,}\mathrm{GB}$ is split into two NUMA domains, one for each socket. For compatibility reasons, we employed the libraries provided by the operating system, but using other optimized libraries might improve the performance of the MATLAB solver.

In Figure 14 we show the assembly times our implementation accrued for various values of $c$ , when $p=2$ and $q=5$ . Inspection of Figure 15 clearly shows that $c=0.75$ is suitable for $u(\bm{x})=\sin\left(\pi\,x_{1}\right)\,\sin\left(\pi\,x_{2}\right)$ and $c=3$ is suitable for $u(\bm{x})=\sin\left(20\,\pi\,x_{1}\right)\,\sin\left(20\,\pi\,x_{2}\right)$ .

When compared to the non-surrogate assembly strategy at just over one million degrees of freedom, note that the first choice gives an assembly speed-up of around $1500\%$ and the second choice gives a speed-up of more than $5000\%$ . These enormous speed-ups are in fact not that surprising after inspecting the percentage of matrix entries which must be computed using traditional quadrature formulas; see, again, Figure 14.

8 Transverse vibrations of an isotropic membrane

One of the great advantages of the IGA paradigm is its superior accuracy in structural vibration problems. Any worthwhile surrogate method should maintain this advantage. Therefore, in this section, we extend the analysis of the previous section to the analysis of transverse vibrations of a two-dimensional elastic membrane. The corresponding weak form is the following:

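Find $u\in H^{1}_{0}(\Omega)\setminus\{0\}$ and $\lambda\in\mathbb{R}$ satisfying $a(u,v)=\lambda\,m(u,v)$ for all $v\in H^{1}_{0}(\Omega)$ , (41)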

where $a(w,v)=\int_{\Omega}\nabla w\cdot\nabla v\,\mathrm{d}x$ and $m(w,v)=\int_{\Omega}w\hskip 0.25ptv\,\mathrm{d}x$ . It is well-known that there are countably many solutions to this problem, $\{(u^{(k)},\lambda^{(k)})\}_{k=1}^{\infty}$ , where each $\lambda^{(k)}>0$ . From here on, we make the ordering assumption $\lambda^{(j)}\leq\lambda^{(k)}$ , for every $j<k$ .

8.1 Surrogate mass matrices

So far, we have only rigorously analyzed surrogates of the elliptic bilinear form $a(\cdot,\cdot)$ , written above. Using the techniques developed thus far, results similar to Theorem 7.3 can be proven for other bilinear forms, such as $m(\cdot,\cdot)$ . Let the corresponding mass matrix and surrogate mass matrix be denoted $\mathsf{M}$ and $\widetilde{\mathsf{M}}$ , respectively. Employing definition 18b, we take for granted that the accompanying surrogate bilinear form $\widetilde{m}(\cdot,\cdot)$ satisfies

[TABLE]

for some $C_{7}$ depending only on $\bm{\varphi}$ , $p$ , $q$ , and $\|\Pi_{H}\|$ .

In the special case that the geometry mapping $\bm{\varphi}(\widehat{\bm{x}})$ is a polynomial, we have a surrogate reproduction property of the mass form $m(u,v)$ . This property is formalized in the following corollary to Proposition 5.2.

Corollary 8.1.

*Assume that the domain mapping $\bm{\varphi}\colon\widehat{\Omega}\rightarrow\Omega$ is defined through a polynomial of order $p$ , i.e., $\bm{\varphi}\in\big{[}\mathcal{Q}_{p}(\widehat{\Omega})\big{]}^{n}$ . Let $\mathsf{M}$ be the coefficient matrix arising from the discretization of $m(\cdot,\cdot)$ and $\widetilde{\mathsf{M}}$ the corresponding surrogate matrix. If $q\geq n\cdot p-1$ , it holds that $\mathsf{M}=\widetilde{\mathsf{M}}$ . *

Proof 8.2.

Let $\bm{J}(\widehat{\bm{x}})$ be the Jacobian (tensor) of $\bm{\varphi}(\widehat{\bm{x}})$ . The transformation of the integral from the physical to the reference domain is given by

[TABLE]

*It remains to show that $G(\widehat{\bm{x}},\widehat{u}(\widehat{\bm{y}}),\widehat{v}(\widehat{\bm{y}}))=\widehat{u}(\widehat{\bm{y}})\hskip 0.25pt\widehat{v}(\widehat{\bm{y}})\,\det(\bm{J}(\widehat{\bm{x}}))$ is a polynomial of degree $n\cdot p-1$ in the $\widehat{\bm{x}}$ -variable. Since $\bm{\varphi}\in\big{[}\mathcal{Q}_{p}(\widehat{\Omega})\big{]}^{n}$ , it holds that $\det(\bm{J})\in\mathcal{Q}_{n\cdot p-1}(\widehat{\Omega})$ . Applying Proposition 5.2 yields the desired reproduction property. *
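The degree count for $\det(\bm{J})$ can be checked mechanically by tracking coordinate-wise polynomial degrees. The sketch below is an illustration under assumptions, not part of the proof: the helper names are hypothetical, each entry $\partial\varphi_{i}/\partial\widehat{x}_{j}$ is taken to have degree $p-1$ in direction $j$ and $p$ in all other directions, and the result is an upper bound since cancellations are ignored.

```python
from itertools import permutations


def jacobian_entry_degree(j, n, p):
    """Coordinate-wise degree of d(phi_i)/d(x_j) for phi_i in Q_p."""
    return tuple(p - 1 if k == j else p for k in range(n))


def det_jacobian_degree(n, p):
    """Upper bound on the coordinate-wise degree of det(J) via Leibniz's formula."""
    best = (0,) * n
    for perm in permutations(range(n)):
        term = (0,) * n
        for i in range(n):
            d = jacobian_entry_degree(perm[i], n, p)
            term = tuple(a + b for a, b in zip(term, d))    # product adds degrees
        best = tuple(max(a, b) for a, b in zip(best, term))  # sum takes the max
    return best


print(det_jacobian_degree(2, 2))  # each component equals n*p - 1
print(det_jacobian_degree(3, 2))
```

Every term in the Leibniz expansion uses each column of $\bm{J}$ exactly once, which is why each direction loses exactly one degree relative to $n\cdot p$ .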

8.2 A priori analysis of the eigenvalue error

Adopt the obvious notation, $(u_{h}^{(k)},\lambda_{h}^{(k)})$ and $(\widetilde{u}_{h}^{(k)},\widetilde{\lambda}_{h}^{(k)})$ , for the standard IGA and surrogate IGA solutions corresponding to 41, respectively. Due to space limitations, we only analyze the convergence of the eigenvalues $\widetilde{\lambda}_{h}^{(k)}\to\lambda^{(k)}$ .

Theorem 8.3.

Let $\theta>1$ . If $u^{(k)}\in H^{p+1}(\Omega)$ , then there exists a constant $C_{8}$ , depending only on $p$ and $\bm{\varphi}$ , such that

[TABLE]

*for every sufficiently small $H>0$ . *

Proof 8.4.

Clearly, $|\lambda^{(k)}-\widetilde{\lambda}_{h}^{(k)}|\leq|\lambda^{(k)}-\lambda_{h}^{(k)}|+|\lambda_{h}^{(k)}-\widetilde{\lambda}_{h}^{(k)}|$ . It is known that if $u^{(k)}\in H^{p+1}(\Omega)$ , then $|\lambda^{(k)}-\lambda_{h}^{(k)}|\leq C_{8}\lambda^{(k)}h^{2p}$ . We now focus on the term $|\lambda_{h}^{(k)}-\widetilde{\lambda}_{h}^{(k)}|$ .

Let $V_{h}^{(k)}$ be the set of all $k$ -dimensional subspaces of $V_{h}$ and fix $\theta>1$ . Two important members of this set are $E_{h}^{(k)}=\mathrm{span}\big{\{}u_{h}^{(l)}\big{\}}_{l=1}^{k}$ and $\widetilde{E}_{h}^{(k)}=\mathrm{span}\big{\{}\widetilde{u}_{h}^{(l)}\big{\}}_{l=1}^{k}$ . Observe that

[TABLE]

Note that $\frac{\widetilde{a}(v,v)}{a(v,v)}=1+\frac{\widetilde{a}(v,v)-a(v,v)}{a(v,v)}\leq 1+C_{4}H^{q+1}$ , by Theorem 7.3. Similarly, by 42, $\frac{m(w,w)}{\widetilde{m}(w,w)}=1+\frac{m(w,w)-\widetilde{m}(w,w)}{\widetilde{m}(w,w)}\leq 1+\theta\cdot C_{7}H^{q+1}$ , in the limit $H\to 0$ . These observations, together with 45, imply

[TABLE]

*for every sufficiently small $H>0$ . Note that 46a implies that $\widetilde{\lambda}_{h}^{(k)}\leq\theta\lambda_{h}^{(k)}$ , as $H\to 0$ . After introducing this inequality into 46b, we arrive at the upper bound $|\lambda_{h}^{(k)}-\widetilde{\lambda}_{h}^{(k)}|\leq\lambda_{h}^{(k)}\,\theta\cdot(C_{4}+C_{7})H^{q+1}$ , in the limit $H\to 0$ . Because $\lambda_{h}^{(k)}\to\lambda^{(k)}$ , we also have $\lambda_{h}^{(k)}\leq\theta\lambda^{(k)}$ , as $H\to 0$ . Only elementary algebra remains in order to arrive at 44. *

Remark 8.5.

*One upshot of Theorem 8.3 is that, if $H=\mathcal{O}(h)$ , then one may wish to choose $q+1>2p$ , in order to recover an optimal spectral convergence rate. Of course, for irregular geometries, it is difficult to maintain the assumption $u^{(k)}\in H^{p+1}(\Omega)$ , and a lower $q$ may still provide a useful approximation. *

8.3 Numerical experiments

Our numerical experiments for problem 41 involve only the two-dimensional quarter annulus domain $\Omega$ , depicted in Figure 16. This domain was chosen instead of Figure 10 (a) so that $u^{(k)}\in H^{3}(\Omega)$ , for each $k$ . Thus, when $p=2$ , Theorem 8.3 concludes that $|\lambda^{(k)}-\widetilde{\lambda}_{h}^{(k)}|\leq\mathcal{O}(h^{4}+H^{q+1})$ . Figure 17, which shows the convergence of the first nine eigenvalues, for $p=2$ , verifies this result. Recall Remark 7.10. Once again, one witnesses the parity effect present in the $L^{2}$ error of the Poisson problems above. That is, we actually observe the stronger conclusion $|\lambda^{(k)}-\widetilde{\lambda}_{h}^{(k)}|=\mathcal{O}(h^{4}+H^{q+2})$ when $q$ is even.

A close inspection of Figure 17 appears to indicate that the accuracy of the surrogate solution improves as the eigenvalues grow. This observation is in line with the previous numerical results for Poisson’s equation, which showed nearly indistinguishable solutions for the “high-frequency” manufactured solution $u(\bm{x})=\sin\left(20\,\pi\,x_{1}\right)\,\sin\left(20\,\pi\,x_{2}\right)$ (cf. Figures 11, 12 and 13). Naturally, we should compare the accuracy of all eigenvalues computed with the standard IGA method to those coming from the surrogate IGA method. This is done, in part, in Figure 18 for both $p=2,3$ . Here, it is more meaningful to use the natural frequencies $(\omega^{(j)})^{2}=\lambda^{(j)}$ . Notice that the differences are extremely small across the entire range of computed frequencies.

9 Plate bending under a transverse load

Another clear advantage of IGA is the simplicity of discretizing high-order PDEs. In this section, we briefly demonstrate that the same features hold true for surrogate IGA methods. As a proof-of-concept, consider the simple Poisson–Kirchhoff isotropic plate bending model. Given a function $f\in L^{2}(\Omega)$ , the corresponding weak form is the following:

[TABLE]

where $a(u,v)=\int_{\Omega}\Delta u\,\Delta v\,\mathrm{d}x$ and $F(v)=\int_{\Omega}f\hskip 0.25ptv\,\mathrm{d}x$ .

9.1 A higher-dimensional kernel

With the same principles as used for Poisson’s equation, one may easily design a surrogate IGA method for 47. In our approach, the corresponding surrogate stiffness matrix $\widetilde{\mathsf{A}}$ was also defined using 18c. Notice that this definition does not preserve the entire kernel found in the true IGA stiffness matrix $\mathsf{A}$ . For instance, one may easily verify that all linear functions lie in the kernel of $a(\cdot,\cdot)$ . Therefore, $\mathsf{A}\mathsf{c}^{(1)}=\mathsf{A}\mathsf{c}^{(2)}=0$ , where $\mathsf{c}^{(1)},\mathsf{c}^{(2)}$ are the $x_{1}$ - and $x_{2}$ -coefficients of the control points, respectively. In our experiments, this property was only recovered in the limits $h\to 0$ or $H\to h$ .

9.2 Numerical experiments

Let $\Omega$ be the quarter annulus domain depicted in Figure 16. Figure 19 shows the convergence of the errors, in the $H^{2}$ , $H^{1}$ , and $L^{2}$ norms, corresponding to this geometry $\Omega$ and the manufactured solution $u(\bm{x})=\sin\left(\pi\,x_{1}\right)\,\mathrm{sinh}\left(\pi\,x_{2}\right)$ . Even though the kernel is not preserved, the numerical results we witnessed are similar to those documented for the $p=3$ experiments in the Poisson setting (see top row of Figure 12). For instance, notice that the surrogate error with $q=2$ is parallel to the reference IGA error ( $M=1$ ), in both the $H^{1}(\Omega)$ and $L^{2}(\Omega)$ norms.

Remark 9.1.

*Although we will not provide any rigorous analysis, if we recall Remark 7.5, the similarity between our Poisson results and those above may still appear somewhat surprising. Indeed, since only the zero row sum property is inherited in the surrogate $\widetilde{\mathsf{A}}$ when using 18c, we can not improve on the upper bound in Lemma 7.1. Had the entire kernel of $\mathsf{A}$ been preserved in $\widetilde{\mathsf{A}}$ , we conjecture that an optimal form of this bound would involve an $h^{4-n}$ scaling factor. Such a factor should permit a surrogate solution $\widetilde{u}_{h}$ of two $h$ -orders higher accuracy. *

10 Stokes’ flow

In this section, we consider a surrogate IGA discretization of Stokes’ flow in a domain $\Omega\subseteq\mathbb{R}^{n}$ . Given a viscosity $\mu\in\mathbb{R}_{>0}$ , a function $\mathbf{f}\in\big{[}L^{2}(\Omega)\big{]}^{n}$ , and a velocity field on the boundary $\mathbf{g}\in\big{[}H^{1/2}(\partial\Omega)\big{]}^{n}$ , $\int_{\partial\Omega}\mathbf{g}\cdot\bm{n}\,\mathrm{d}s=0$ , the corresponding weak form is the following:

[TABLE]

where $a(\mathbf{u},\mathbf{v})=\int_{\Omega}\mu\hskip 0.25pt\nabla\mathbf{u}\cdot\nabla\mathbf{v}\,\mathrm{d}x$ , $b(p,\mathbf{v})=\int_{\Omega}p\,\nabla\cdot\mathbf{v}\,\mathrm{d}x$ , and $F(\mathbf{v})=\int_{\Omega}\mathbf{f}\cdot\mathbf{v}\,\mathrm{d}x$ . In this scenario, the pressure is only unique up to a constant; therefore, we enforce the pressure to have zero mean value, i.e., $\int_{\Omega}p\,\mathrm{d}x=0$ .

10.1 Surrogate divergence matrices

Since no symmetry can be exploited, the surrogate divergence matrices $\mathsf{B}$ are constructed by employing definition 18a. As with the mass term in Section 8, the divergence form $b(q,\mathbf{u})$ has a surrogate reproduction property when the geometry map is a polynomial. This property is formalized in the following corollary of Proposition 5.2.

Corollary 10.1.

*Assume that the domain mapping $\bm{\varphi}\colon\widehat{\Omega}\rightarrow\Omega$ is defined through a polynomial of order $p$ , i.e., $\bm{\varphi}\in\big{[}\mathcal{Q}_{p}(\widehat{\Omega})\big{]}^{n}$ . Let $\mathsf{B}$ be the coefficient matrix arising from the discretization of $b(\cdot,\cdot)$ and $\widetilde{\mathsf{B}}$ the corresponding surrogate matrix. If $q\geq(n-1)p$ , it holds that $\mathsf{B}=\widetilde{\mathsf{B}}$ . *

Proof 10.2.

In the following, we only consider the cases $n=2$ and $n=3$ . Let $\bm{J}(\widehat{\bm{x}})$ be the Jacobian of $\bm{\varphi}(\widehat{\bm{x}})$ . Assuming a gradient preserving transformation to the reference domain, the divergence $\nabla\cdot\mathbf{u}$ in the physical domain is transformed to $\mathrm{tr}(\bm{J}^{-1}\widehat{\nabla}\widehat{\mathbf{u}})$ , where $\mathrm{tr}$ is the trace. Using the property $\bm{J}^{-1}=\det(\bm{J})^{-1}\mathrm{adj}(\bm{J})$ , where $\mathrm{adj}(\bm{J})$ is the adjugate of $\bm{J}$ , the bilinear form $b(\cdot,\cdot)$ may be written as

[TABLE]
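As a sketch of the change of variables behind this expression, assume $\det(\bm{J})>0$ on $\widehat{\Omega}$ (an assumption made here for illustration) and write $\widehat{p}=p\circ\bm{\varphi}$ and $\widehat{\mathbf{u}}=\mathbf{u}\circ\bm{\varphi}$. The factor $\det(\bm{J})^{-1}$ from the inverse Jacobian then cancels against the volume element:

$$
b(p,\mathbf{u})
=\int_{\Omega}p\,\nabla\cdot\mathbf{u}\,\mathrm{d}x
=\int_{\widehat{\Omega}}\widehat{p}\,\mathrm{tr}\big(\bm{J}^{-1}\widehat{\nabla}\widehat{\mathbf{u}}\big)\det(\bm{J})\,\mathrm{d}\widehat{x}
=\int_{\widehat{\Omega}}\widehat{p}\,\mathrm{tr}\big(\mathrm{adj}(\bm{J})\,\widehat{\nabla}\widehat{\mathbf{u}}\big)\,\mathrm{d}\widehat{x}.
$$

Under this assumption, the integrand depends on the geometry only through $\mathrm{adj}(\bm{J})$, whose entries are polynomials whenever $\bm{\varphi}$ is.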

It remains to show that $G(\widehat{\bm{x}},\widehat{p}(\widehat{\bm{y}}),\widehat{\mathbf{u}}(\widehat{\bm{y}}))=\widehat{p}(\widehat{\bm{y}})\,\mathrm{tr}\left(\mathrm{adj}(\bm{J}(\widehat{\bm{x}}))\widehat{\nabla}\widehat{\mathbf{u}}(\widehat{\bm{y}})\right)$ is a polynomial of degree at most $(n-1)p$ in the $\widehat{\bm{x}}$-variable; applying Proposition 5.2 then yields the desired reproduction property. The trace operator $\mathrm{tr}$ is linear, thus it suffices to analyze the entries of $\mathrm{adj}(\bm{J})$. In 2D, the components of $\mathrm{adj}(\bm{J})$ and $\bm{J}$ only differ by their position and sign. Since each component of $\bm{J}$ is an element of $\mathcal{Q}_{p}(\widehat{\Omega})$, we conclude that each component of $\mathrm{adj}(\bm{J})$ is also in $\mathcal{Q}_{p}(\widehat{\Omega})$. In 3D, the components of $\mathrm{adj}(\bm{J})$ are made up of determinants of $2\times 2$ sub-matrices of $\bm{J}$. Taking the trace yields

[TABLE]

*For the first summand, we have $\bm{J}_{22}\in\mathcal{P}_{p}\otimes\mathcal{P}_{p-1}\otimes\mathcal{P}_{p}$ , $\bm{J}_{23}\in\mathcal{P}_{p}\otimes\mathcal{P}_{p}\otimes\mathcal{P}_{p-1}$ , $\bm{J}_{32}\in\mathcal{P}_{p}\otimes\mathcal{P}_{p-1}\otimes\mathcal{P}_{p}$ , and $\bm{J}_{33}\in\mathcal{P}_{p}\otimes\mathcal{P}_{p}\otimes\mathcal{P}_{p-1}$ . From this it follows that $\bm{J}_{22}\cdot\bm{J}_{33}\in\mathcal{P}_{2p}\otimes\mathcal{P}_{2p-1}\otimes\mathcal{P}_{2p-1}$ and $\bm{J}_{23}\cdot\bm{J}_{32}\in\mathcal{P}_{2p}\otimes\mathcal{P}_{2p-1}\otimes\mathcal{P}_{2p-1}$ . This means that $\bm{J}_{22}\cdot\bm{J}_{33}-\bm{J}_{23}\cdot\bm{J}_{32}\in\mathcal{Q}_{2p}(\widehat{\Omega})$ . Applying the same arguments to the other summands finally yields that $\mathrm{tr}\left(\mathrm{adj}(\bm{J})\right)\in\mathcal{Q}_{2p}(\widehat{\Omega})$ . *
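The degree count in the 3D case can be checked with exact integer arithmetic. The following Python sketch is illustrative only and is not taken from the paper's software; all helper names (`rand_poly`, `diff`, `mul`, `sub`, `max_degrees`, `trace_adj_degrees`) are hypothetical. It draws a random map $\bm{\varphi}\in[\mathcal{Q}_{p}]^{3}$ with integer coefficients, forms its Jacobian, and reports the largest exponent of each variable in $\mathrm{tr}(\mathrm{adj}(\bm{J}))$, which should not exceed $2p$.

```python
import random
from itertools import product

N = 3  # spatial dimension


def rand_poly(p, rng):
    """Random element of Q_p: degree at most p in each of the N variables."""
    return {e: rng.randint(-5, 5) for e in product(range(p + 1), repeat=N)}


def diff(f, axis):
    """Partial derivative of f with respect to the variable number `axis`."""
    out = {}
    for e, c in f.items():
        if e[axis] > 0:
            d = e[:axis] + (e[axis] - 1,) + e[axis + 1:]
            out[d] = out.get(d, 0) + c * e[axis]
    return out


def mul(f, g):
    """Product of two polynomials stored as {exponent tuple: coefficient}."""
    out = {}
    for e1, c1 in f.items():
        for e2, c2 in g.items():
            e = tuple(a + b for a, b in zip(e1, e2))
            out[e] = out.get(e, 0) + c1 * c2
    return out


def sub(f, g):
    """Difference f - g."""
    out = dict(f)
    for e, c in g.items():
        out[e] = out.get(e, 0) - c
    return out


def max_degrees(f):
    """Largest exponent of each variable among the nonzero terms of f."""
    deg = [0] * N
    for e, c in f.items():
        if c != 0:
            deg = [max(a, b) for a, b in zip(deg, e)]
    return tuple(deg)


def trace_adj_degrees(p, seed=0):
    """Per-variable degrees of tr(adj(J)) for a random map in [Q_p]^3."""
    rng = random.Random(seed)
    phi = [rand_poly(p, rng) for _ in range(N)]
    J = [[diff(phi[i], j) for j in range(N)] for i in range(N)]
    trace = {}
    # tr(adj(J)) is the sum of the three principal 2x2 minors of J.
    for a, b in [(1, 2), (0, 2), (0, 1)]:
        minor = sub(mul(J[a][a], J[b][b]), mul(J[a][b], J[b][a]))
        for e, c in minor.items():
            trace[e] = trace.get(e, 0) + c
    return max_degrees(trace)


for p in (1, 2, 3):
    print(p, trace_adj_degrees(p))
```

Integer coefficients are used so that cancellations between the two products in each minor are detected exactly rather than up to rounding error.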

In order to discretize 48, an inf-sup stable space pair is required. For this purpose, we choose the isogeometric subgrid element as described in [7]. In this discretization, the velocity field is defined on a subgrid of the pressure where each pressure element is subdivided into $2^{n}$ elements. This allows for using a velocity space of order $p+1$ with $C^{p}(\widehat{\Omega})$ regularity and a pressure space of order $p$ with $C^{p-1}(\widehat{\Omega})$ regularity.
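To get a feel for the sizes of the two spaces, the following Python sketch counts basis functions under several simplifying assumptions that are not stated in the text: a single patch, a uniform pressure mesh with $N$ elements per direction, open knot vectors, and "order" read as polynomial degree. A maximally smooth spline space of degree $d$ on $m$ elements then has $m+d$ basis functions per direction. The function name `subgrid_space_dims` is hypothetical, and boundary conditions and the zero-mean pressure constraint are ignored.

```python
def subgrid_space_dims(N, p, n=2):
    """Return (velocity dofs, pressure dofs) for the subgrid pair.

    N: pressure elements per direction (assumed uniform mesh)
    p: pressure degree; the velocity has degree p + 1 on the refined mesh
    n: spatial dimension
    """
    vel_per_dir = 2 * N + (p + 1)  # 2N subgrid elements, degree p + 1, C^p
    pre_per_dir = N + p            # N elements, degree p, C^(p-1)
    return n * vel_per_dir ** n, pre_per_dir ** n


# Example: second-degree pressure and third-degree velocity in 2D.
print(subgrid_space_dims(N=4, p=2))
```

The factor `n` in the velocity count reflects the $n$ vector components, each discretized with the same scalar spline space.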

We do not provide a priori error estimates for the Stokes problem. When the divergence matrix is reproduced, one could follow arguments similar to those presented for the Poisson problem; when it is not reproduced, further work is necessary. However, the results in the next subsection suggest that reproduction is not required in order to obtain an optimal order of convergence.

10.2 Numerical experiments

Our computational study of Stokes’ flow comprises two separate experiments. In the first experiment, we provide a smooth manufactured solution in order to investigate convergence rates. In the second experiment, we consider a lid-driven cavity benchmark problem. Due to the discontinuous boundary conditions, the solution of this problem has singularities at two corners of the domain. Both examples are computed on the domain shown in Figure 20, which was constructed by a Coons patch with a cubic boundary parameterization. We discretize the problem using the aforementioned subgrid element with a third-order velocity and a second-order pressure. The viscosity is set to $\mu=1$ for all scenarios.

In the first example, the manufactured solution is chosen to be

[TABLE]

Note that this solution satisfies $\nabla\cdot\mathbf{u}=0$ and the constant $C_{p}\in\mathbb{R}$ is chosen such that the pressure mean is zero. The Dirichlet boundary condition $\mathbf{g}$ and the right-hand side $\mathbf{f}$ are chosen to match the manufactured solution. In Figure 21, we present convergence plots for the velocity and pressure separately for different surrogate orders $q$ and fixed $M=5$. For reference, we also include the standard discretization with $M=1$ in these plots. We observe the expected convergence orders for all $q\geq 2$. In agreement with Corollary 10.1, the divergence matrices were perfectly reproduced for every $q\geq 3$.
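Convergence orders of the kind read off from such plots are commonly estimated from errors on successive meshes. The following Python sketch shows that calculation; the function name `observed_orders` is hypothetical, and the error values are synthetic numbers of the form $Ch^{3}$ generated for illustration, not data from the experiments reported here.

```python
import math


def observed_orders(h, err):
    """Estimate convergence orders from mesh sizes h and errors err.

    For err ~ C * h**k, consecutive pairs give
    k ~ log(err[i] / err[i+1]) / log(h[i] / h[i+1]).
    """
    return [
        math.log(err[i] / err[i + 1]) / math.log(h[i] / h[i + 1])
        for i in range(len(err) - 1)
    ]


# Synthetic errors behaving like 0.7 * h^3 (illustrative only).
h = [2.0 ** -k for k in range(3, 7)]
err = [0.7 * hk ** 3 for hk in h]
print(observed_orders(h, err))
```

In practice, the estimated orders for a surrogate discretization would be compared against those of the standard discretization on the same sequence of meshes.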

In the second example, we consider a lid-driven cavity benchmark on the domain shown in Figure 20, where the fluid is driven on the top edge by the constant velocity $\mathbf{g}=(1,0)^{\top}$ and we assume no-slip boundary conditions $\mathbf{g}=\bm{0}$ on the remaining parts of the boundary. The degrees of freedom corresponding to the nodal basis functions in the top left and top right corners are set to zero. Furthermore, the volume forces are neglected, i.e., $\mathbf{f}=\bm{0}$. In Figure 22 (a), we show the velocity streamlines computed using a standard IGA approach on a mesh with $320\times 320$ control points. The effect of different surrogate approaches on the velocity streamlines may be observed in Figures 22 (b) to 22 (j). In the case $q=3$, where the divergence matrices are in fact reproduced, the streamlines show the same behavior as in the standard approach even for $M=100$. For other values of $q$ and $M$, the streamlines differ, but they approach the reference solution as $q$ increases and $M$ decreases. We note that using an interpolation order of $q=1,2,3$ in computation is probably still not recommended for standard practice. For instance, assembly using $q=5$ took roughly the same time as any of these lower-order choices and, in this case, the surrogate solution $(\widetilde{\mathbf{u}}_{h},\widetilde{p}_{h})$ should be expected to be even more accurate.

Appendix A Marsden’s identity

The purpose of this appendix is to substantiate some of the claims made in Section 3.3, as well as provide a complete proof of Theorem 4.3. In turn, we adopt all of the notation and assumptions introduced in Section 2.2. We begin by stating Theorem A.1, the proof of which can be found in [18]; see also [19, 15, 16].

Theorem A.1 (Marsden).

Let $\psi_{k}(\widehat{y})=(\xi_{k+1}-\widehat{y})\cdots(\xi_{k+p}-\widehat{y})$ . For any $\widehat{x},\widehat{y}\in[0,1]$ ,

[TABLE]
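Marsden's identity is easy to check numerically. The Python sketch below assumes the classical form $(\widehat{x}-\widehat{y})^{p}=\sum_{k}\psi_{k}(\widehat{y})\,B_{k}(\widehat{x})$ with $\psi_{k}$ as defined above, zero-based knot indexing in which $B_{k}$ is supported on $[\xi_{k},\xi_{k+p+1}]$, and an open uniform knot vector on $[0,1]$; the indexing in Section 2.2 may differ. The function names `bspline`, `psi`, and `marsden_sum` are hypothetical.

```python
def bspline(k, p, knots, x):
    """Value at x of the k-th B-spline of degree p (Cox-de Boor recursion)."""
    if p == 0:
        return 1.0 if knots[k] <= x < knots[k + 1] else 0.0
    left = right = 0.0
    if knots[k + p] > knots[k]:  # skip terms with zero-length support
        left = ((x - knots[k]) / (knots[k + p] - knots[k])
                * bspline(k, p - 1, knots, x))
    if knots[k + p + 1] > knots[k + 1]:
        right = ((knots[k + p + 1] - x) / (knots[k + p + 1] - knots[k + 1])
                 * bspline(k + 1, p - 1, knots, x))
    return left + right


def psi(k, p, knots, y):
    """psi_k(y) = (xi_{k+1} - y) * ... * (xi_{k+p} - y)."""
    out = 1.0
    for j in range(1, p + 1):
        out *= knots[k + j] - y
    return out


def marsden_sum(p, knots, x, y):
    """Evaluate sum_k psi_k(y) * B_k(x) over all basis functions."""
    n_basis = len(knots) - p - 1
    return sum(psi(k, p, knots, y) * bspline(k, p, knots, x)
               for k in range(n_basis))


# Cubic B-splines on four uniform elements of [0, 1] with an open knot vector.
p = 3
knots = [0.0] * (p + 1) + [0.25, 0.5, 0.75] + [1.0] * (p + 1)
worst = max(abs(marsden_sum(p, knots, x, y) - (x - y) ** p)
            for x in (0.1, 0.25, 0.6, 0.99) for y in (0.0, 0.3, 1.0))
print("largest deviation:", worst)
```

The half-open interval convention in the degree-zero case means the sketch should be evaluated for $\widehat{x}\in[0,1)$.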

This theorem allows us to conclude that the expression 12 is valid and, moreover, $w\in\mathcal{Q}_{p}(\widetilde{\Omega}_{\bm{0}})$ . This is shown in two steps.

Corollary A.2.

Let $\Psi(\widetilde{\bm{y}})=\prod_{i=1}^{n}\prod_{j=0}^{p-1}\big{(}\frac{j}{m-p}-\frac{p-1}{2(m-p)}-\widetilde{y}_{i}\big{)}$ and $\bm{p}=(p,\ldots,p)\in\mathbb{N}^{n}$ . Then, for any $\widetilde{\bm{x}},\widetilde{\bm{y}}\in\widetilde{\Omega}_{\bm{0}}$ ,

[TABLE]

Proof A.3.

Observe that, for each $k=p+1,\ldots,m-p$ , it holds that $\widetilde{x}^{(k)}+h=\widetilde{x}^{(k+1)}=\xi_{k+1}+(p+1)\cdot\nicefrac{{h}}{{2}}$ . Recall that $k=p+1,\ldots,m-p$ are exactly the indices of the cardinal B-splines $b_{k}(\widehat{x})=b(\widehat{x}-\widetilde{x}^{(k)})$ . Therefore, by Theorem A.1, we see that

[TABLE]

*for each $i=1,\ldots,n$ . The result now follows from the definitions of $\widehat{B}$ and $\Psi$ . *

Corollary A.2 can be used to write out an elegant expression for any polynomial in $\mathcal{Q}_{p}(\widetilde{\Omega}_{\bm{0}})$ , in terms of cardinal B-splines. Indeed, let ${\bm{\alpha}}=(\alpha_{1},\ldots,\alpha_{n})$ be a multi-index, let $f$ be an arbitrary polynomial in $\mathcal{Q}_{p}(\widetilde{\Omega}_{\bm{0}})$ , and let $D^{{\bm{\alpha}}}$ be the ${\bm{\alpha}}$ -derivative operator, in the variable $\widetilde{\bm{y}}$ . By Taylor’s Theorem, it holds that

[TABLE]

for every $\widetilde{\bm{x}},\widetilde{\bm{y}}\in\widetilde{\Omega}_{\bm{0}}$ . Next, applying $D^{\bm{p}-{\bm{\alpha}}}$ to both sides of 53, we find that

[TABLE]

Together, 55 and 56 imply $f(\widetilde{\bm{x}})=\sum_{\widetilde{\bm{x}}_{i}\in\widetilde{\mathbb{X}}}\widehat{B}(\widetilde{\bm{x}}-\widetilde{\bm{x}}_{i})(Lf)(\widetilde{\bm{x}}_{i})$ , where

[TABLE]

We may now state our second corollary of Theorem A.1.

Corollary A.4.

Let $W(\widehat{\bm{x}})=\sum_{j=1}^{N}w_{j}\widehat{B}_{j}(\widehat{\bm{x}})$ . If $W\in\mathcal{Q}_{p}(\widehat{\Omega})$ , then there exists a polynomial $w\in\mathcal{Q}_{p}(\widetilde{\Omega}_{\bm{0}})$ such that $w_{i}=w(\widetilde{\bm{x}}_{i})$ , for each $\widetilde{\bm{x}}_{i}\in\widetilde{\mathbb{X}}$ . Moreover, there exists a constant $C$ , depending only on $p$ , such that

[TABLE]

Proof A.5.

*Clearly, $w=LW\in\mathcal{Q}_{p}(\widetilde{\Omega}_{\bm{0}})$ . Therefore, one readily determines that $\|w\|_{L^{\infty}(\widetilde{\Omega}_{\bm{0}})}\leq C\|W\|_{W^{p,\infty}(\widetilde{\Omega}_{\bm{0}})}$ , for some constant $C$ , depending only on $p$ . Due to the equivalence of norms on finite dimensional vector spaces (note that the dimension of $\mathcal{Q}_{p}(\widetilde{\Omega}_{\bm{0}})$ depends only on $p$ ) and the fact $\widetilde{\Omega}_{\bm{0}}\subseteq\overline{\widehat{\Omega}}$ , we immediately arrive at 58. *

We may now complete the proof of Theorem 4.3.

Proof A.6 (Proof of Theorem 4.3).

Let $\widetilde{\bm{x}}\in\widetilde{\Omega}$ be arbitrary. Then, for any ${\bm{\delta}}\in\mathscr{D}$ , we have that $\widetilde{\bm{x}}+{\bm{\delta}}\in\widetilde{\Omega}_{\bm{0}}$ . Moreover, for every multi-index $|{\bm{\alpha}}|=r$ , the product rule can be used to show that

[TABLE]

*for some $C$ depending only on ${\bm{\alpha}}$ . Since both functions $K$ and $W$ are determined by the choice of $\Omega$ , Corollary A.4 and a scaling argument show that $\|D^{{\bm{\alpha}}}\Phi_{\bm{\delta}}\|_{L^{\infty}(\widetilde{\Omega})}\leq Ch^{n-2}$ , where $C$ now depends on $p$ , ${\bm{\alpha}}$ , and $\bm{\varphi}$ . Lemma 4.1 now completes the proof. *

Acknowledgments

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 800898. This work was also partly supported by the German Research Foundation through the Priority Programme 1648 “Software for Exascale Computing” (SPPEXA) and by grant WO671/11-1.

Bibliography


[1] Fork of the GeoPDEs git repository including the surrogate branch. https://github.com/drzisga/geopdes, 2019. [Online; accessed March 13th 2019].
[2] P. Antolin, A. Buffa, F. Calabrò, M. Martinelli, and G. Sangalli. Efficient matrix computation for tensor-product isogeometric analysis: The use of sum factorization. Computer Methods in Applied Mechanics and Engineering, 285:817–828, 2015.
[3] F. Auricchio, F. Calabrò, T. J. Hughes, A. Reali, and G. Sangalli. A simple algorithm for obtaining nearly optimal quadrature rules for NURBS-based isogeometric analysis. Computer Methods in Applied Mechanics and Engineering, 249:15–27, 2012.
[4] S. Bauer, M. Huber, S. Ghelichkhan, M. Mohr, U. Rüde, and B. Wohlmuth. Large-scale simulation of mantle convection based on a new matrix-free approach. Journal of Computational Science, 31:60–76, 2019.
[5] S. Bauer, M. Huber, M. Mohr, U. Rüde, and B. Wohlmuth. A new matrix-free approach for large-scale geodynamic simulations and its performance. In International Conference on Computational Science, pages 17–30. Springer, 2018.
[6] S. Bauer, M. Mohr, U. Rüde, J. Weismüller, M. Wittmann, and B. Wohlmuth. A two-scale approach for efficient on-the-fly operator assembly in massively parallel high performance multigrid codes. Applied Numerical Mathematics, 122:14–38, 2017.
[7] A. Bressan and G. Sangalli. Isogeometric discretizations of the Stokes problem: stability analysis by the macroelement technique. IMA Journal of Numerical Analysis, 33(2):629–651, 2013.
[8] A. Bressan and S. Takacs. Sum-factorization techniques in isogeometric analysis. arXiv preprint arXiv:1809.05471, 2018.
