Kernel Multi-Grid on Manifolds

Thomas Hangelbroek; Christian Rieger

arXiv:2302.13039·math.NA·October 4, 2024

TL;DR

This paper introduces a multigrid method for efficiently solving kernel-based PDEs on surfaces, achieving scalable linear algebra solutions with rigorous convergence analysis.

Contribution

It develops a geometric multigrid framework for kernel PDEs on manifolds, addressing computational challenges and providing proven convergence rates.

Findings

01

Linear system solving cost scales log-linear with degrees of freedom

02

Multigrid method accelerates kernel PDE solutions on surfaces

03

Rigorous analysis confirms convergence rates

Abstract

Kernel methods for solving partial differential equations on surfaces have the advantage that those methods work intrinsically on the surface and yield high approximation rates if the solution to the partial differential equation is smooth enough. Localized Lagrange bases have proven to alleviate the computational complexity of usual kernel methods to some extent, although the efficient numerical solution of the ill-conditioned linear systems of equations arising from kernel-based Galerkin solutions to PDEs has not been addressed in the literature so far. In this article we apply the framework of the geometric multigrid method with a $τ \geq 2$ -cycle to scattered, quasi-uniform point clouds on the surface. We show that the resulting linear algebra can be accelerated by using the Lagrange function decay, with convergence rates which are obtained by a rigorous analysis. In particular, we…

Equations

$$\boldsymbol{\mathbf{A}}_{L}\boldsymbol{\mathbf{u}}_{L}^{\star}=\boldsymbol{\mathbf{b}}_{L}$$

$$\mathcal{O}\Bigl(N_{L}\log(N_{L})^{d}\log\Bigl(\frac{\epsilon_{\max}}{\|u-u^{(0)}\|}\Bigr)\Bigr)$$

$$\alpha_{\mathbb{M}}r^{d}\leq\mu(B(x,r))\leq\beta_{\mathbb{M}}r^{d}$$

$$q:=\frac{1}{2}\min_{\zeta\in\Xi}\mathrm{dist}(\zeta,\Xi\setminus\{\zeta\})\quad\text{and}\quad h:=h(\Xi,\mathbb{M}):=\sup_{x\in\mathbb{M}}\mathrm{dist}(x,\Xi).$$

$$\frac{\mu(\mathbb{M})}{\beta_{\mathbb{M}}}h^{-d}\leq\#\Xi\leq\frac{\mu(\mathbb{M})}{\alpha_{\mathbb{M}}}q^{-d}.$$

$$\sum_{\zeta\in\Xi}f\bigl(\mathrm{dist}(\zeta,x)\bigr)\leq\max_{t\leq q}f(t)+\frac{\beta_{\mathbb{M}}}{\alpha_{\mathbb{M}}}\sum_{n=1}^{\infty}(n+2)^{d}\max_{nq\leq t\leq(n+1)q}f(t).$$

$$\langle\mathbf{T},\mathbf{S}\rangle_{g,x}=\sum_{\boldsymbol{\mathbf{i}},\boldsymbol{\mathbf{j}}\in\{1,\dots,d\}^{k}}g^{i_{1}j_{1}}(x)\cdots g^{i_{k}j_{k}}(x)\,S_{\boldsymbol{\mathbf{i}}}(x)\,T_{\boldsymbol{\mathbf{j}}}(x).$$

$$\nabla f=\sum\frac{\partial f}{\partial x_{j}}\,dx^{j}=df.$$

$$\nabla^{*}\omega=-\frac{1}{\sqrt{\det g(x)}}\sum_{j,k}\frac{\partial}{\partial x_{k}}\Bigl(\sqrt{\det g(x)}\,g^{jk}(x)\,\omega_{j}(x)\Bigr).$$

$$\|f\|_{W_{p}^{k}(\Omega)}^{p}=\sum_{\ell=0}^{k}\int_{\Omega}\|\nabla^{\ell}f\|_{g,x}^{p}\,\mathrm{d}\mu(x)<\infty.$$

$$\Gamma_{1}\|u\circ\mathrm{Exp}_{x}\|_{W_{p}^{j}(\Omega)}\leq\|u\|_{W_{p}^{j}(\mathrm{Exp}_{x}(\Omega))}\leq\Gamma_{2}\|u\circ\mathrm{Exp}_{x}\|_{W_{p}^{j}(\Omega)}.$$

$$\|u\|_{W_{2}^{k}(\mathbb{M})}\leq C_{Z}h^{m-k}\|u\|_{W_{2}^{m}(\mathbb{M})}.$$

$$\mathcal{L}=-\nabla^{*}\textsc{a}_{2}^{\flat}(\nabla\cdot)+\textsc{a}_{1}\nabla+\textsc{a}_{0}.$$

$$\textsc{a}_{2}(v,v)\geq c_{0}\,g(v,v)\quad\text{and}\quad\textsc{a}_{0}+\frac{1}{2}\nabla^{*}\textsc{a}_{1}\geq c_{0}.$$

$$\int_{\mathbb{M}}u(\textsc{a}_{1}\nabla u)=\frac{1}{2}\int_{\mathbb{M}}(\textsc{a}_{1}\nabla(u^{2}))=\frac{1}{2}\int_{\Omega}(\nabla^{*}\textsc{a}_{1})u^{2}\quad\longrightarrow\quad\int_{\mathbb{M}}u(\textsc{a}_{1}\nabla u+\textsc{a}_{0}u)\geq c_{0}\|u\|^{2}.$$

$$c_{0}\|u\|_{W_{2}^{1}(\mathbb{M})}^{2}\leq\|u\|_{\mathcal{L}}^{2}=a(u,u)\leq C\|u\|_{W_{2}^{1}(\mathbb{M})}^{2}.$$

$$a(u,v)=(f,v)_{L_{2}(\mathbb{M})}=F(v)\quad\text{for all }v\in W_{2}^{1}(\mathbb{M}).$$

$$a(P_{V_{h}}u,v)=a(u,v)\quad\text{for all }v\in V_{h}.$$

$$c_{1}\|u-P_{V_{h}}u\|_{W_{2}^{1}(\mathbb{M})}^{2}\leq c_{2}\|u-P_{V_{h}}u\|_{W_{2}^{1}(\mathbb{M})}\|u-v\|_{W_{2}^{1}(\mathbb{M})}\quad\text{for all }v\in V_{h}.$$

$$\|u-P_{V_{h}}u\|_{W_{2}^{1}(\mathbb{M})}\leq\frac{c_{2}}{c_{1}}\,\mathrm{dist}_{W_{2}^{1}(\mathbb{M})}(u,V_{h}).$$

$$\|u-P_{V_{h}}u\|_{L_{2}(\mathbb{M})}\leq Ch\,\mathrm{dist}_{\|\cdot\|_{W_{2}^{1}(\mathbb{M})}}(u,V_{h}).$$

$$V_{\Xi}:=\Bigl\{\sum_{\xi\in\Xi}a_{\xi}\phi(\cdot,\xi)\;\Big|\;(\forall p\in\Pi)\ \sum_{\xi\in\Xi}a_{\xi}p(\xi)=0\Bigr\}+\Pi$$

$$\mathrm{dist}_{W_{2}^{k}(\mathbb{M})}(u,V_{\Xi})\leq Ch^{j-k}\|u\|_{W_{2}^{j}(\mathbb{M})}.$$

$$\|I_{\Xi}u-u\|_{W_{2}^{k}(\mathbb{M})}\leq Ch^{m-k}\|I_{\Xi}u-u\|_{W_{2}^{m}(\mathbb{M})}$$

$$\|I_{\Xi}u-u\|_{W_{2}^{k}(\mathbb{M})}\leq Ch^{m-k}\|u\|_{W_{2}^{m}(\mathbb{M})}.$$

$$K(\tilde{u},t)=\inf_{g\in W_{2}^{m}(\mathbb{M})}\|\tilde{u}-g\|_{W_{2}^{k}(\mathbb{M})}+t\|g\|_{W_{2}^{m}(\mathbb{M})}$$

$$\|\tilde{u}-g\|_{W_{2}^{k}(\mathbb{M})}$$

$$\|u-P_{\Xi}u\|_{L_{2}(\mathbb{M})}\leq Ch\|u\|_{W_{2}^{1}(\mathbb{M})}$$

$$\|\chi_{\xi}\|_{W_{2}^{m}(\mathbb{M}\setminus B(\xi,R))}\leq C_{en}\,q^{\frac{d}{2}-m}e^{-\nu\frac{R}{h}}.$$

$$|\chi_{\xi}(x)|\leq C_{pw}\,e^{-\nu\frac{\mathrm{dist}(x,\xi)}{h}}$$

Taxonomy

Topics: Advanced Numerical Analysis Techniques · Advanced Numerical Methods in Computational Mathematics · 3D Shape Modeling and Analysis

Full text

Kernel Multi-Grid on Manifolds

Funding: Thomas Hangelbroek’s research was supported by grant DMS-2010051 from the National Science Foundation.

Thomas Hangelbroek, Department of Mathematics, University of Hawai‘i – Mānoa, Honolulu, HI 96822, USA.

Christian Rieger, Philipps-Universität Marburg, Department of Mathematics and Computer Science, Hans-Meerwein-Straße 6, 35032 Marburg.

Abstract

Kernel methods for solving partial differential equations on surfaces have the advantage that those methods work intrinsically on the surface and yield high approximation rates if the solution to the partial differential equation is smooth enough. Naive implementations of kernel-based methods suffer, however, from cubic complexity in the degrees of freedom. Localized Lagrange bases have proven to overcome this computational complexity to some extent. In this article we present a rigorous proof for a geometric multigrid method with $\tau\geq 2$-cycle for elliptic partial differential equations on surfaces which is based on precomputed Lagrange basis functions. Our new multigrid method provably works on quasi-uniform point clouds on the surface and hence does not require a grid structure. Moreover, the computational cost scales log-linearly in the degrees of freedom.

Keywords: Geometric multi-grid, partial differential equations on manifolds, kernel-based Galerkin methods, localized Lagrange basis

MSC codes: 65F10, 65Y20, 65M12, 65M15, 65M55, 65M60

1 Introduction

The numerical solution of partial differential equations on curved geometries is crucial to many real-world applications. Of the many numerical methods (see for instance [2] for an overview using finite elements, or [13], [10] for alternative mesh-free methods), we focus on mesh-free kernel-based Galerkin methods, which have a number of merits: they deliver high approximation orders for smooth data, provide smooth solutions, and work coordinate-free without the need for rigid underlying geometric structures like meshes and regular grids. The major drawback of plain kernel methods is, however, their prohibitively high numerical cost.

In this paper we address the problem of reducing the computational costs of using kernels without spoiling their analytic advantages. It has been noticed that in many cases there is a localized basis for kernel-based trial spaces (essentially a Lagrange-type basis, or nodal basis in the finite element context). We use the localized basis to provide a rigorous analysis of classical multi-grid methods, specifically the $W$-cycle, for second order linear elliptic differential equations on compact Riemannian manifolds without boundary. Although multi-grid methods have recently gained attention in the mesh-less community as well, see [9] and [16], the rigorous analysis we provide has been missing in the kernel-based context.

Throughout this article, we assume access to a sequence of refined, quasi-uniform point clouds on the manifold $\mathbb{M}$ and to the corresponding Lagrange bases. Both the construction of point clouds on manifolds and the computation of the Lagrange basis are independent of the partial differential equation and can hence be pre-computed and even stored. Precisely, we assume a nested sequence of sets $\Xi_{0}\subset\Xi_{1}\subset\dots\subset\Xi_{L}\subset\mathbb{M}$ and we denote the finest kernel-based Galerkin problem by

$$\boldsymbol{\mathbf{A}}_{L}\boldsymbol{\mathbf{u}}_{L}^{\star}=\boldsymbol{\mathbf{b}}_{L}\tag{1}$$

where $\boldsymbol{\mathbf{A}}_{L}$ (please refer to Eq. 18 for a precise definition) is the matrix which represents the elliptic operator $\mathcal{L}$ on a finite dimensional kernel space expressed in that Lagrange basis, as considered in [3] or [4]. The corresponding approximation takes the form $u_{L}\in V_{\Xi_{L}}$ (for a definition of the space, please refer to Eq. 11). We will stick to the convention that boldface letters denote quantities from linear algebra whereas the non-boldface letters denote functions.

We point out that Eq. 1 is usually a densely populated linear system and the condition number $\operatorname{cond}_{2}(\boldsymbol{\mathbf{A}}_{L})\sim N^{2/d}_{L}$ is growing with the problem size. Thus iterative methods cannot be expected to work well on this linear system, as the iterations needed to ensure a prescribed accuracy will grow with the number of degrees of freedom.
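The growth of the condition number with the problem size can be illustrated on a small model problem. The sketch below does not use the kernel stiffness matrix of the paper; as a stand-in it assumes a periodic finite-difference discretization of $1-\Delta$ on the unit circle (so $d=1$), for which the condition number is $1+(N/\pi)^{2}$ for even $N$ and hence grows like $N^{2/d}$. The function names `model_matrix` and `condition_number` are hypothetical.

```python
import numpy as np

def model_matrix(n):
    """Periodic finite-difference matrix for 1 - Laplacian on the unit circle.

    Stand-in model only: this is NOT the kernel-based stiffness matrix.
    """
    h = 2.0 * np.pi / n                       # grid spacing
    shift = np.roll(np.eye(n), 1, axis=1)     # cyclic shift matrix
    laplacian = (2.0 * np.eye(n) - shift - shift.T) / h**2
    return np.eye(n) + laplacian

def condition_number(n):
    """Spectral condition number of the model matrix with n unknowns."""
    return np.linalg.cond(model_matrix(n))

for n in (32, 64, 128):
    # Doubling n roughly quadruples the condition number (d = 1).
    print(n, condition_number(n))
```

For a stationary iteration such as plain damped Jacobi, this growth translates directly into an iteration count that increases with the number of unknowns, which is the behaviour the multi-grid construction is meant to avoid.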

The main novelty of this paper is that we define suitable smoothing, restriction and prolongation operators and define a mesh-free multi-grid algorithm based on those choices which can be shown to be a contraction with grid-size independent norm, see Theorem 5.9.

Furthermore, by carefully truncating the various rapidly decaying matrices (i.e., stiffness, restriction and prolongation matrices) which appear in the algorithm, we obtain an approximate solution to Eq. 1 which requires only $\mathcal{O}\bigl(N_{L}\log(N_{L})^{d}\bigr)$ floating point operations per iteration and where again the multi-grid matrix has a norm bound less than unity (see Theorem 6.6), for a total operation count of

$$\mathcal{O}\Bigl(N_{L}\log(N_{L})^{d}\log\Bigl(\frac{\epsilon_{\max}}{\|u-u^{(0)}\|}\Bigr)\Bigr)$$

to achieve error $\|\boldsymbol{\mathbf{u}}-\boldsymbol{\mathbf{u}}_{L}^{\star}\|_{\ell_{2}}\leq\epsilon+C\log(N_{L})^{d}N_{L}^{-J}$ , where $J>0$ is a user-determined constant which depends linearly on the sparsity of truncated matrices in the algorithm, see Remark 7.9. A final application of the triangle inequality also implies an error bound for the function reconstruction of the true solution to the partial differential equation by its numerically obtained approximations.
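The overall structure of a $\tau$-cycle (with $\tau=2$ giving the $W$-cycle) can be sketched generically. The code below is only an illustration under strong simplifying assumptions: it uses a periodic finite-difference model of $1-\Delta$ on the unit circle, linear interpolation as prolongation, its transpose as restriction, Galerkin coarse matrices, and damped Jacobi smoothing with dense matrices. The kernel-based operators and the truncation analysed in this paper are not reproduced, and all function names are hypothetical.

```python
import numpy as np

def fine_matrix(n):
    """Periodic finite-difference stand-in for 1 - Laplacian on the unit circle."""
    h = 2.0 * np.pi / n
    shift = np.roll(np.eye(n), 1, axis=1)
    return np.eye(n) + (2.0 * np.eye(n) - shift - shift.T) / h**2

def prolongation(n_coarse):
    """Periodic linear interpolation from n_coarse to 2 * n_coarse points."""
    P = np.zeros((2 * n_coarse, n_coarse))
    for i in range(n_coarse):
        P[2 * i, i] = 1.0
        P[2 * i + 1, i] = 0.5
        P[2 * i + 1, (i + 1) % n_coarse] = 0.5
    return P

def build_hierarchy(n_coarsest, levels):
    """Matrices A[0..levels] (Galerkin coarsening) and prolongations P[1..levels]."""
    sizes = [n_coarsest * 2**l for l in range(levels + 1)]
    P = [None] + [prolongation(sizes[l - 1]) for l in range(1, levels + 1)]
    A = [None] * (levels + 1)
    A[levels] = fine_matrix(sizes[levels])
    for l in range(levels, 0, -1):
        A[l - 1] = P[l].T @ A[l] @ P[l]
    return A, P

def jacobi(A, u, b, sweeps, omega):
    """Damped Jacobi smoothing sweeps."""
    d_inv = 1.0 / np.diag(A)
    for _ in range(sweeps):
        u = u + omega * d_inv * (b - A @ u)
    return u

def tau_cycle(level, u, b, A, P, tau=2, sweeps=2, omega=2.0 / 3.0):
    """One tau-cycle: smooth, tau recursive coarse corrections, smooth."""
    if level == 0:
        return np.linalg.solve(A[0], b)       # direct solve on the coarsest level
    u = jacobi(A[level], u, b, sweeps, omega)
    residual = P[level].T @ (b - A[level] @ u)  # restriction = transposed prolongation
    e = np.zeros_like(residual)
    for _ in range(tau):
        e = tau_cycle(level - 1, e, residual, A, P, tau, sweeps, omega)
    u = u + P[level] @ e
    return jacobi(A[level], u, b, sweeps, omega)

A, P = build_hierarchy(n_coarsest=4, levels=4)   # 4, 8, 16, 32, 64 unknowns
rng = np.random.default_rng(0)
b = rng.standard_normal(A[-1].shape[0])
u = np.zeros_like(b)
history = [np.linalg.norm(b)]
for _ in range(10):
    u = tau_cycle(4, u, b, A, P)
    history.append(np.linalg.norm(b - A[-1] @ u))
print(history[-1] / history[0])                  # relative residual after 10 cycles
```

In this toy setting every operation is dense; the log-linear cost stated above relies on replacing the dense matrices by truncated, sparse ones, which this sketch does not attempt.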

The remainder of this article is organized as follows. In Section 2, we introduce the basic notation of second order elliptic equations on manifolds, and their solution via kernel-based Galerkin approximation. In this section we demonstrate the approximation property in the kernel context, which, along with the smoothing property, provides the analytic backbone for the success of the multi-grid method. Section 3 introduces the phenomenon of rapidly decaying Lagrange-type bases, which holds for certain kernels – using such bases permits stiffness matrices with rapid off-diagonal decay, among other things. Of special importance is the diagonal behavior of the stiffness matrix given in Lemma 3.5, which is a novel contribution of this paper, and which is the analytic result necessary to prove the smoothing property. In Section 4 we discuss the smoothing property of damped Jacobi iterations for kernel-based methods both in the case of symmetric and non-symmetric differential operators. In Section 5 we introduce kernel-based restriction and prolongation operators, the classical two-grid method and then the $W$ -cycle. In this section, we prove Theorem 5.9, a consequence of which is Eq. 42. Section 6 treats the error resulting from small perturbations of the stiffness matrix, as well as the prolongation and restriction matrices. The main result in this section, Theorem 6.6, demonstrates how such errors affect the multi-grid approximation error. Section 7 investigates the computationally efficient truncated multi-grid method as an application of the previous section.

2 Problem Set Up

2.1 Manifold

Consider a compact Riemannian manifold without boundary $\mathbb{M}$. Here we list some useful tools and their properties which hold in this setting; the relevant results are all taken from [6]. The metric tensor at $x\in\mathbb{M}$ is denoted $g(x)$. For tangent vectors $V,W\in T_{x}\mathbb{M}$ we have $\langle V,W\rangle_{g,x}=\sum V^{j}W^{k}g_{j,k}$. For cotangent vectors $\mu,\nu\in T_{x}^{*}\mathbb{M}$ we have $\langle\mu,\nu\rangle_{g,x}=\sum\mu_{j}\nu_{k}g^{j,k}$, where $\sum g_{j,k}g^{k,\ell}=\delta_{j,\ell}$.

The Riemannian metric gives rise to a volume form $\mathrm{d}\mu=\sqrt{\det(g(x))}\mathrm{d}x$ , and by compactness, there exist constants $0<\alpha_{\mathbb{M}}\leq\beta_{\mathbb{M}}$ so that

$$\alpha_{\mathbb{M}}r^{d}\leq\mu(B(x,r))\leq\beta_{\mathbb{M}}r^{d}$$

for any $x\in\mathbb{M}$ and $0<r<\mathrm{diam}(\mathbb{M})$ .

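Because $\mathbb{M}$ is a metric space, for a finite subset $\Xi\subset\mathbb{M}$ we can define the following useful quantities: the separation distance $q$ of $\Xi$ and the fill distance $h$ of $\Xi$ in $\mathbb{M}$, given by

$$q:=\frac{1}{2}\min_{\zeta\in\Xi}\mathrm{dist}(\zeta,\Xi\setminus\{\zeta\})\quad\text{and}\quad h:=h(\Xi,\mathbb{M}):=\sup_{x\in\mathbb{M}}\mathrm{dist}(x,\Xi).$$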

The finite sets considered throughout this paper will be quasi-uniform, with mesh ratio $\rho:=h/q$ bounded by a fixed constant. For this reason, quantities which are controlled above or below by a power of $q$ can likewise be controlled by a power of $h$. In short, whenever possible, we express estimates in terms of the fill distance $h$, allowing constants to depend on $\rho$. For instance, the cardinality $\#\Xi$ is bounded above and below by

$$\frac{\mu(\mathbb{M})}{\beta_{\mathbb{M}}}h^{-d}\leq\#\Xi\leq\frac{\mu(\mathbb{M})}{\alpha_{\mathbb{M}}}q^{-d}.$$

Similarly, if $f:[0,\infty)\to\mathbb{R}$ is continuous then for any $x\in\mathbb{M}$ ,

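$$\sum_{\zeta\in\Xi}f\bigl(\mathrm{dist}(\zeta,x)\bigr)\leq\max_{t\leq q}f(t)+\frac{\beta_{\mathbb{M}}}{\alpha_{\mathbb{M}}}\sum_{n=1}^{\infty}(n+2)^{d}\max_{nq\leq t\leq(n+1)q}f(t).$$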
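The quantities of this subsection are easy to compute on a simple example. The sketch below assumes the unit circle as the manifold (so $d=1$, $\mu(\mathbb{M})=2\pi$ and $\mu(B(x,r))=2r$, i.e. $\alpha_{\mathbb{M}}=\beta_{\mathbb{M}}=2$) with equispaced centers, and it approximates the supremum in the fill distance by a maximum over a dense sample. It then checks the cardinality bounds and the sum bound for the decreasing function $f(t)=e^{-\nu t/h}$. All names and the choice $\nu=1$ are illustrative assumptions.

```python
import numpy as np

def circle_dist(a, b):
    """Geodesic distances between two arrays of angles on the unit circle."""
    diff = np.abs(a[:, None] - b[None, :]) % (2.0 * np.pi)
    return np.minimum(diff, 2.0 * np.pi - diff)

def separation_distance(centers):
    """Half the smallest distance between two distinct centers."""
    D = circle_dist(centers, centers)
    np.fill_diagonal(D, np.inf)
    return 0.5 * D.min()

def fill_distance(centers, n_samples=20000):
    """Approximate sup_x dist(x, centers) by a maximum over a dense sample."""
    samples = np.linspace(0.0, 2.0 * np.pi, n_samples, endpoint=False)
    return circle_dist(samples, centers).min(axis=1).max()

n_centers = 16
centers = 2.0 * np.pi * np.arange(n_centers) / n_centers
q = separation_distance(centers)
h = fill_distance(centers)
rho = h / q                                   # mesh ratio

# Cardinality bounds with d = 1, mu(M) = 2*pi, alpha_M = beta_M = 2.
alpha_M, beta_M, d = 2.0, 2.0, 1
lower = 2.0 * np.pi / beta_M * h ** (-d)
upper = 2.0 * np.pi / alpha_M * q ** (-d)

# Sum bound for f(t) = exp(-nu * t / h); f is decreasing, so its maximum
# over [n q, (n + 1) q] is f(n q).
nu = 1.0
f = lambda t: np.exp(-nu * t / h)
x = np.array([0.3])
lhs = f(circle_dist(x, centers)).sum()
idx = np.arange(1, 200)                       # truncation of the infinite sum
rhs = f(0.0) + beta_M / alpha_M * np.sum((idx + 2) ** d * f(idx * q))

print(q, h, rho, lower, upper, lhs, rhs)
```

For equispaced points the separation and fill distances coincide, so the mesh ratio is one and both cardinality bounds are attained up to rounding.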

2.2 Sobolev spaces

There is a natural inner product on each fiber of the tensor bundle $T_{s}^{r}\mathbb{M}$. To define Sobolev spaces, we will be concerned with covariant tensor fields: sections of $T_{k}\mathbb{M}$. Given tensor fields $\mathbf{T},\mathbf{S}\in\mathcal{T}_{k}\mathbb{M}$, written in coordinates as $\mathbf{T}=\sum_{\boldsymbol{\mathbf{j}}\in\{1,\dots,d\}^{k}}T_{\boldsymbol{\mathbf{j}}}\,d{x^{j_{1}}}\dots d{x^{j_{k}}}$ and $\mathbf{S}=\sum_{\boldsymbol{\mathbf{j}}\in\{1,\dots,d\}^{k}}S_{\boldsymbol{\mathbf{j}}}\,d{x^{j_{1}}}\dots d{x^{j_{k}}}$, at each point $x\in\mathbb{M}$ we have

$$\langle\mathbf{T},\mathbf{S}\rangle_{g,x}=\sum_{\boldsymbol{\mathbf{i}},\boldsymbol{\mathbf{j}}\in\{1,\dots,d\}^{k}}g^{i_{1}j_{1}}(x)\cdots g^{i_{k}j_{k}}(x)\,S_{\boldsymbol{\mathbf{i}}}(x)\,T_{\boldsymbol{\mathbf{j}}}(x).$$

The covariant derivative $\nabla$ maps tensor fields of rank $(r,s)$ to fields of rank $(r,s+1)$ . Its adjoint (with respect to the $L_{2}$ inner products on the space of sections of $T_{r}^{s}(\mathbb{M})$ ) is denoted $\nabla^{*}$ . For functions, this is fairly elementary. The covariant derivative of a function $f:\mathbb{M}\to\mathbb{R}$ equals its exterior derivative; in coordinates, we have

$$\nabla f=\sum\frac{\partial f}{\partial x_{j}}\,dx^{j}=df.$$

For a 1-form $\omega=\sum\omega_{j}dx^{j}$ , we have

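$$\nabla^{*}\omega=-\frac{1}{\sqrt{\det g(x)}}\sum_{j,k}\frac{\partial}{\partial x_{k}}\Bigl(\sqrt{\det g(x)}\,g^{jk}(x)\,\omega_{j}(x)\Bigr).$$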
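As a worked instance, assume $\mathbb{M}$ is the unit sphere $\mathbb{S}^{2}$ with spherical coordinates $(\theta,\varphi)$, $0<\theta<\pi$. The computation below uses the standard coordinate expression of the adjoint with the volume density $\sqrt{\det g}$ and is meant only as an illustration of the notation.

```latex
g=\begin{pmatrix}1&0\\0&\sin^{2}\theta\end{pmatrix},\qquad
\sqrt{\det g}=\sin\theta,\qquad
g^{-1}=\begin{pmatrix}1&0\\0&\sin^{-2}\theta\end{pmatrix},

\nabla^{*}\omega
=-\frac{1}{\sin\theta}\Bigl[\partial_{\theta}\bigl(\sin\theta\,\omega_{\theta}\bigr)
+\partial_{\varphi}\Bigl(\frac{\omega_{\varphi}}{\sin\theta}\Bigr)\Bigr],

\nabla^{*}\nabla f
=-\frac{1}{\sin\theta}\,\partial_{\theta}\bigl(\sin\theta\,\partial_{\theta}f\bigr)
-\frac{1}{\sin^{2}\theta}\,\partial_{\varphi}^{2}f
=-\Delta_{\mathbb{S}^{2}}f .
```

With $\omega=\nabla f=df$, the composition $\nabla^{*}\nabla$ is therefore the negative Laplace-Beltrami operator on the sphere.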

The Sobolev space $W_{p}^{k}(\Omega)$ is defined to be the set of functions $f:\Omega\to\mathbb{R}$ which satisfy

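$$\|f\|_{W_{p}^{k}(\Omega)}^{p}=\sum_{\ell=0}^{k}\int_{\Omega}\|\nabla^{\ell}f\|_{g,x}^{p}\,\mathrm{d}\mu(x)<\infty.$$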

Lemma 3.2 from [6] applies, so there are uniform constants $\Gamma_{1},\Gamma_{2}$ and $r>0$ so that the family of exponential maps $\{\mathrm{Exp}_{x}:B_{r}\to\mathbb{M}\mid x\in\mathbb{M}\}$ (which are diffeomorphisms taking $0$ to $x$) provides local metric equivalences: for any open set $\Omega\subset B_{r}$, we have

$$\Gamma_{1}\|u\circ\mathrm{Exp}_{x}\|_{W_{p}^{j}(\Omega)}\leq\|u\|_{W_{p}^{j}(\mathrm{Exp}_{x}(\Omega))}\leq\Gamma_{2}\|u\circ\mathrm{Exp}_{x}\|_{W_{p}^{j}(\Omega)}.\tag{4}$$

This shows that $W_{p}^{k}(\mathbb{M})$ can be endowed with equivalent norms using a partition of unity $(\varphi_{j})_{j\leq N}$ subordinate to a cover $\{\mathcal{O}_{j}\}_{j\leq N}$ with associated charts $\psi_{j}:\mathcal{O}_{j}\to\mathbb{R}^{d}$ to obtain $\|u\|_{W_{p}^{k}(\mathbb{M})}^{p}\sim\sum_{j=1}^{N}\|(\varphi_{j}u)\circ\psi_{j}^{-1}\|_{W_{p}^{k}(\mathbb{R}^{d})}^{p}$ . Here constants of equivalence depend on the partition of unity and charts.

A useful result in this setting, which we will use explicitly in this article and which also underlies a number of background results in Section 2.5, is the following zeros estimate [7, Corollary A.13], which holds for Sobolev spaces: if $u\in W_{2}^{m}(\mathbb{M})$ satisfies $u|_{\Xi}=0$, then

$$\|u\|_{W_{2}^{k}(\mathbb{M})}\leq C_{Z}h^{m-k}\|u\|_{W_{2}^{m}(\mathbb{M})}.$$

The result can also be obtained on a bounded Lipschitz domain $\Omega\subset\mathbb{M}$ that satisfies a uniform cone condition, with the cone having radius $R_{\Omega}\leq r_{\mathbb{M}}/3$; see [7, Theorem A.11] for a precise statement and definitions of the quantities involved.

2.3 Elliptic operator

We write the operator $\mathcal{L}$ in divergence form:

$$\mathcal{L}=-\nabla^{*}\textsc{a}_{2}^{\flat}(\nabla\cdot)+\textsc{a}_{1}\nabla+\textsc{a}_{0}.$$

Here $\textsc{a}_{0}$ is a smooth function, $\textsc{a}_{1}$ is a smooth tensor field of type $(1,0)$ and $\textsc{a}_{2}$ is a smooth tensor field of type $(2,0)$ which generates the field $\textsc{a}_{2}^{\flat}$ of type $(1,1)$. In coordinates, $\textsc{a}_{2}$ has the form $a^{jk}\frac{\partial}{\partial x_{j}}\frac{\partial}{\partial x_{k}}$, and $\textsc{a}_{2}^{\flat}$ is $a_{k}^{j}dx_{k}\frac{\partial}{\partial x_{j}}$ with $a_{k}^{j}=\sum_{\ell}g_{k\ell}a^{\ell j}$. Furthermore, we require $c_{0}>0$ to be a constant so that

$$\textsc{a}_{2}(v,v)\geq c_{0}\,g(v,v)\quad\text{and}\quad\textsc{a}_{0}+\frac{1}{2}\nabla^{*}\textsc{a}_{1}\geq c_{0}.\tag{6}$$

The latter ensures that

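$$\int_{\mathbb{M}}u(\textsc{a}_{1}\nabla u)=\frac{1}{2}\int_{\mathbb{M}}(\textsc{a}_{1}\nabla(u^{2}))=\frac{1}{2}\int_{\Omega}(\nabla^{*}\textsc{a}_{1})u^{2}\quad\longrightarrow\quad\int_{\mathbb{M}}u(\textsc{a}_{1}\nabla u+\textsc{a}_{0}u)\geq c_{0}\|u\|^{2}.$$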

Thus, Eq. 6 guarantees that the bilinear form $a(u,v):=\int v\mathcal{L}u$ , defined initially for smooth functions, is bounded on $W_{2}^{1}(\mathbb{M})$ and is coercive. Thus, we have that the energy norm $\|u\|^{2}_{\mathcal{L}}:=a(u,u)$ satisfies the metric equivalence

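$$c_{0}\|u\|_{W_{2}^{1}(\mathbb{M})}^{2}\leq\|u\|_{\mathcal{L}}^{2}=a(u,u)\leq C\|u\|_{W_{2}^{1}(\mathbb{M})}^{2}.$$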

2.4 Galerkin methods

We fix an $f\in L_{2}(\mathbb{M})$ and consider $u\in W^{1}_{2}(\mathbb{M})$ as the solution to

$$a(u,v)=(f,v)_{L_{2}(\mathbb{M})}=F(v)\quad\text{for all }v\in W_{2}^{1}(\mathbb{M}).$$

Regularity estimates ([14, Chapter 5.11, Theorem 11.1]) yield that $u\in W^{2}_{2}(\mathbb{M})$ and $\|u\|_{W^{2}_{2}(\mathbb{M})}\leq C\|f\|_{L_{2}(\mathbb{M})}$ .

Consider now a family of finite dimensional subspaces $(V_{h})$ with $V_{h}\subset W_{2}^{1}(\mathbb{M})$ associated to a parameter $h>0$. (In the sequel, these will be kernel spaces generated by a subset $\Xi\subset\mathbb{M}$, and $h=h(\Xi,\mathbb{M})$ will be the fill distance. For now, we consider a more abstract setting, with $h>0$ only playing a role in establishing the approximation property $\mathrm{dist}_{W_{2}^{1}}(g,V_{h})\leq Ch\|g\|_{W_{2}^{2}(\mathbb{M})}$ below.) Define $P_{V_{h}}:W^{1}_{2}(\mathbb{M})\to V_{h}$ so that

$$a(P_{V_{h}}u,v)=a(u,v)\quad\text{for all }v\in V_{h}.$$

The classical Céa Lemma yields

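$$c_{1}\|u-P_{V_{h}}u\|_{W_{2}^{1}(\mathbb{M})}^{2}\leq c_{2}\|u-P_{V_{h}}u\|_{W_{2}^{1}(\mathbb{M})}\|u-v\|_{W_{2}^{1}(\mathbb{M})}\quad\text{for all }v\in V_{h}.$$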

And thus we get

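$$\|u-P_{V_{h}}u\|_{W_{2}^{1}(\mathbb{M})}\leq\frac{c_{2}}{c_{1}}\,\mathrm{dist}_{W_{2}^{1}(\mathbb{M})}(u,V_{h}).$$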
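For orientation, the standard chain of inequalities behind the Céa estimate can be written out as follows. Here $c_{1}$ is read as a coercivity constant and $c_{2}$ as a continuity constant of the bilinear form $a$ on $W_{2}^{1}(\mathbb{M})$; this reading is an assumption based on the usual form of the argument.

```latex
\begin{aligned}
c_{1}\,\|u-P_{V_{h}}u\|_{W_{2}^{1}(\mathbb{M})}^{2}
 &\le a\bigl(u-P_{V_{h}}u,\;u-P_{V_{h}}u\bigr)
 && \text{(coercivity)}\\
 &= a\bigl(u-P_{V_{h}}u,\;u-v\bigr)
 && \text{(Galerkin orthogonality, since } v-P_{V_{h}}u\in V_{h}\text{)}\\
 &\le c_{2}\,\|u-P_{V_{h}}u\|_{W_{2}^{1}(\mathbb{M})}\,\|u-v\|_{W_{2}^{1}(\mathbb{M})}
 && \text{(boundedness)}
\end{aligned}
```

Dividing by $c_{1}\|u-P_{V_{h}}u\|_{W_{2}^{1}(\mathbb{M})}$ and taking the infimum over $v\in V_{h}$ gives the quasi-optimality bound with constant $c_{2}/c_{1}$.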

This can be improved by a Nitsche-type argument to get the following result, whose proof can be found in many textbooks on numerical methods for partial differential equations.

Lemma 2.1.

Suppose the family $(V_{h})$ has the property that for all $\tilde{u}\in W_{2}^{2}(\mathbb{M})$ , the distance in $W_{2}^{1}(\mathbb{M})$ from $V_{h}$ satisfies $\mathrm{dist}_{\|\cdot\|_{W^{1}_{2}(\mathbb{M})}}\left(\tilde{u},V_{h}\right)\leq Ch\left\|\tilde{u}\right\|_{W^{2}_{2}(\mathbb{M})}$ . Then for any $u\in W_{2}^{2}(\mathbb{M})$ ,

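$$\|u-P_{V_{h}}u\|_{L_{2}(\mathbb{M})}\leq Ch\,\mathrm{dist}_{\|\cdot\|_{W_{2}^{1}(\mathbb{M})}}(u,V_{h}).$$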

We point out that Lemma 2.1 does not depend on the choice of a specific basis in the finite dimensional space $V_{h}$.

2.5 Kernel approximation

We consider a continuous function $\phi:\mathbb{M}^{2}\to\mathbb{R}$ , the kernel, which satisfies a number of analytic properties which we explain in this section.

Most important is that $\phi$ is conditionally positive definite with respect to some (possibly trivial) finite dimensional subspace $\Pi\subset C^{\infty}(\mathbb{M})$ (generally a space spanned by some eigenfunctions of the Laplace-Beltrami operator). Conditional positive definiteness with respect to $\Pi$ means that for any $\Xi\subset\mathbb{M}$, the collocation matrix $\bigl(\phi(\xi,\zeta)\bigr)_{\xi,\zeta}$ is positive definite on the vector space $\{a\in\mathbb{R}^{\Xi}\mid(\forall p\in\Pi)\,\sum_{\xi\in\Xi}a_{\xi}p(\xi)=0\}$. As a result, if $\Xi$ separates elements of $\Pi$, then the space

$$V_{\Xi}:=\Bigl\{\sum_{\xi\in\Xi}a_{\xi}\phi(\cdot,\xi)\;\Big|\;(\forall p\in\Pi)\ \sum_{\xi\in\Xi}a_{\xi}p(\xi)=0\Bigr\}+\Pi$$

has dimension $\#\Xi$ . In case $\Pi=\{0\}$ , the kernel is positive definite, and the collocation matrix is strictly positive definite on $\mathbb{R}^{\Xi}$ , and $V_{\Xi}=\mathrm{span}_{\xi\in\Xi}\phi(\cdot,\xi)$ .
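The positive definite case $\Pi=\{0\}$ can be checked numerically on a small example. The sketch below makes several assumptions that are not taken from this paper: the manifold is the unit sphere in $\mathbb{R}^{3}$, the kernel is the restriction of the Euclidean Matérn kernel with smoothness $3/2$ (positive definite on $\mathbb{R}^{3}$, hence on any subset), and the centers are a Fibonacci lattice. The function names are hypothetical.

```python
import numpy as np

def fibonacci_sphere(n):
    """Roughly uniform points on the unit sphere (Fibonacci lattice)."""
    i = np.arange(n)
    z = 1.0 - (2.0 * i + 1.0) / n
    phi = i * np.pi * (3.0 - np.sqrt(5.0))    # golden-angle increments
    r = np.sqrt(1.0 - z**2)
    return np.column_stack((r * np.cos(phi), r * np.sin(phi), z))

def matern32(X, Y, length=1.0):
    """Matern-3/2 kernel in the Euclidean (chordal) distance."""
    r = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=2)
    s = np.sqrt(3.0) * r / length
    return (1.0 + s) * np.exp(-s)

Xi = fibonacci_sphere(100)
K = matern32(Xi, Xi)                          # collocation matrix (phi(xi, zeta))
eigs = np.linalg.eigvalsh(K)
print(eigs.min(), eigs.max())                 # all eigenvalues should be positive
```

Since the collocation matrix is symmetric positive definite here, $V_{\Xi}$ is spanned by the 100 kernel translates and interpolation on $\Xi$ is uniquely solvable.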

For a conditionally positive definite kernel, there is an associated reproducing kernel semi-Hilbert space $\mathcal{N}(\phi)\subset C(\mathbb{M})$ with the property that $\Pi=\mathrm{null}(\|\cdot\|_{\mathcal{N}})$ and that for any $a\in\mathbb{R}^{\Xi}$ for which $\sum_{\xi\in\Xi}a_{\xi}\delta_{\xi}\perp\Pi$, the identity $\sum a_{\xi}f(\xi)=\langle f,\sum a_{\xi}\phi(\cdot,\xi)\rangle_{\mathcal{N}}$ holds for all $f\in\mathcal{N}$. It follows that if $\Xi$ separates elements of $\Pi$, interpolation with $V_{\Xi}$ is well defined on $\Xi$ and the projection $I_{\Xi}:\mathcal{N}\to V_{\Xi}$ is orthogonal with respect to the semi-norm on $\mathcal{N}$.

Of special interest are (conditionally) positive definite kernels that have Sobolev native spaces.

Lemma 2.2.

If $\phi$ is conditionally positive definite with respect to $\Pi$ and satisfies the equivalence $\mathcal{N}/\Pi\cong W_{2}^{m}(\mathbb{M})/\Pi$, then there is a constant $C$ so that for any integers $k,j$ with $0\leq k\leq j\leq m$ and any $u\in W_{2}^{j}(\mathbb{M})$,

$$\mathrm{dist}_{W_{2}^{k}(\mathbb{M})}(u,V_{\Xi})\leq Ch^{j-k}\|u\|_{W_{2}^{j}(\mathbb{M})}.$$

Proof 2.3.

The case $k=j$ is trivial; it follows by considering $0\in V_{\Xi}$ .

For $j=m$ and $0\leq k<m$, the zeros estimate [7, Corollary A.13] ensures that

$$\|I_{\Xi}u-u\|_{W_{2}^{k}(\mathbb{M})}\leq Ch^{m-k}\|I_{\Xi}u-u\|_{W_{2}^{m}(\mathbb{M})}$$

(because the interpolation error $I_{\Xi}u-u$ vanishes on $\Xi$). The hypothesis then gives $\|I_{\Xi}u-u\|_{W_{2}^{k}(\mathbb{M})}\leq Ch^{m-k}\|I_{\Xi}u-u\|_{\mathcal{N}}$, with a suitably enlarged constant. Because $I_{\Xi}$ is an orthogonal projector, $\|I_{\Xi}u-u\|_{\mathcal{N}}\leq\|u\|_{\mathcal{N}}$, so we have (again enlarging the constant) that

$$\|I_{\Xi}u-u\|_{W_{2}^{k}(\mathbb{M})}\leq Ch^{m-k}\|u\|_{W_{2}^{m}(\mathbb{M})}.$$

In case $0\leq k<j<m$ , we use the fact that $W_{2}^{j}(\mathbb{M})=[W_{2}^{k}(\mathbb{M}),W_{2}^{m}(\mathbb{M})]_{\frac{j-k}{m-k},2}$ . This is [15, Theorem 5]. For $\tilde{u}\in W_{2}^{j}(\mathbb{M})$ , this means that the $K$ -functional

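$$K(\tilde{u},t)=\inf_{g\in W_{2}^{m}(\mathbb{M})}\|\tilde{u}-g\|_{W_{2}^{k}(\mathbb{M})}+t\|g\|_{W_{2}^{m}(\mathbb{M})}$$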

satisfies the condition $\int_{0}^{\infty}(t^{-\frac{j-k}{m-k}}K(\tilde{u},t))^{2}\frac{\mathrm{d}t}{t}<\infty$ . Since $K(\tilde{u},t)$ is continuous and monotone, we have that $t\mapsto t^{-\frac{j-k}{m-k}}K(\tilde{u},t)$ is bounded on $(0,\infty)$ . Thus for $t=h^{m-k}$ , there exists $g\in W_{2}^{m}(\mathbb{M})$ so that

[TABLE]

Let $s=I_{\Xi}g$. The above estimate gives $\|I_{\Xi}g-g\|_{W_{2}^{k}(\mathbb{M})}\leq Ch^{m-k}\|g\|_{W_{2}^{m}(\mathbb{M})}$. This implies that $\|\tilde{u}-I_{\Xi}g\|_{W_{2}^{k}(\mathbb{M})}\leq Ch^{j-k}\|\tilde{u}\|_{W_{2}^{j}(\mathbb{M})}$ as desired.

As a consequence, the kernel Galerkin solution $u_{\Xi}\in V_{\Xi}$ to $\mathcal{L}u=f$ with $f\in W_{2}(\mathbb{M})$ satisfies $\|u-u_{\Xi}\|_{W_{2}^{1}(\mathbb{M})}\leq Ch^{j-1}\|f\|_{W_{2}^{j+1}(\mathbb{M})}$ . More importantly, for our purposes, the hypothesis of Lemma 2.1 is satisfied by the space $V_{\Xi}$ . Indeed, by Eq. 10, we have the following approximation property:

$$\|u-P_{\Xi}u\|_{L_{2}(\mathbb{M})}\leq Ch\|u\|_{W_{2}^{1}(\mathbb{M})}$$

which, together with a smoothing property, forms the backbone of the convergence theory for the multigrid method.

3 The Lagrange basis and stiffness matrix

For a kernel $\phi$ and a set $\Xi$ which separates points of $\Pi$ , the Lagrange basis $(\chi_{\xi})_{\xi\in\Xi}$ for $V_{\Xi}$ satisfies $\chi_{\xi}(\zeta)=\delta_{\xi,\zeta}$ for each $\xi,\zeta\in\Xi$ .

A natural consequence of $\mathcal{N}\cong W_{2}^{m}(\mathbb{M})$ is that there exists a constant $C$ so that for any $\Xi\subset\mathbb{M}$ , the bound $\|\chi_{\xi}\|_{W_{2}^{m}(\mathbb{M})}\leq Cq^{\frac{d}{2}-m}$ holds. A stronger result is the following:

Assumption 1.

We assume that there is $m>d/2+1$ so that $\mathcal{N}\cong W_{2}^{m}(\mathbb{M})$ , and, furthermore, there exist constants $\nu>0$ and $C_{en}$ so that for $R>0$

$$\|\chi_{\xi}\|_{W_{2}^{m}(\mathbb{M}\setminus B(\xi,R))}\leq C_{en}\,q^{\frac{d}{2}-m}e^{-\nu\frac{R}{h}}.$$

This gives rise to a number of analytic properties, some of which we present here (there are many more, see [5] and [4] for detailed discussions). For the following estimates, the constants of equivalence depend on $C_{en}$, $\nu$, and $\rho$.

Pointwise decay: there exist constants $C_{pw}$ and $\nu>0$ so that for any $\Xi\subset\mathbb{M}$, the estimate

$$|\chi_{\xi}(x)|\leq C_{pw}\,e^{-\nu\frac{\mathrm{dist}(x,\xi)}{h}}$$

holds. Here $C_{pw}\leq\rho^{m-d/2}C_{en}$ , where we recall that the mesh ratio is $\rho=h/q$ .

Hölder continuity: Of later importance, we mention the following condition, which follows from [7, Corollary A.15]. For any $\epsilon<m-d/2$ , the Lagrange function is $\epsilon$ Hölder continuous, and satisfies the bound

[TABLE]

Although we make explicit dependence on it here, for the remainder of this article, we assume constants to depend on $\rho$ .

Riesz property: there exist constants $0<C_{1}\leq C_{2}<\infty$ so that for any $\Xi\subset\mathbb{M}$ , and $a\in\mathbb{R}^{\Xi}$ , we have

[TABLE]

Bernstein inequalities: There is a constant $C_{\mathfrak{B}}$ so that for $0\leq k\leq m$ ,

[TABLE]
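The Lagrange basis and its decay can be observed numerically in a simple setting. The sketch below assumes the unit circle with equispaced centers and, as a stand-in kernel, the Matérn kernel with smoothness $3/2$ evaluated in the chordal distance; whether this particular kernel satisfies Assumption 1 is not claimed here. The Lagrange coefficients are obtained by inverting the collocation matrix, and the code prints the largest value of $|\chi_{\xi}|$ beyond a given multiple of $h$ from the center. All names are hypothetical.

```python
import numpy as np

def kernel(theta, eta):
    """Matern-3/2 kernel in the chordal distance between angles on the unit circle."""
    r = 2.0 * np.abs(np.sin(0.5 * (theta[:, None] - eta[None, :])))
    s = np.sqrt(3.0) * r
    return (1.0 + s) * np.exp(-s)

n = 64
centers = 2.0 * np.pi * np.arange(n) / n
h = np.pi / n                                  # fill distance of equispaced points

# Column j holds the kernel coefficients of the Lagrange function chi_j.
coeffs = np.linalg.solve(kernel(centers, centers), np.eye(n))

def lagrange(x):
    """Rows: evaluation points, columns: Lagrange functions chi_xi."""
    return kernel(x, centers) @ coeffs

x = np.linspace(0.0, 2.0 * np.pi, 2048, endpoint=False)
chi0 = lagrange(x)[:, 0]                       # Lagrange function centered at angle 0
dist = np.minimum(x, 2.0 * np.pi - x)          # geodesic distance to the center
for k in (2, 5, 10, 20):
    # Largest magnitude of chi_0 at distance at least k * h from its center.
    print(k, np.abs(chi0[dist >= k * h]).max())
```

Plotting the printed maxima against $k$ on a logarithmic scale is a quick way to estimate a decay rate $\nu$ for a given kernel and point set.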

3.1 The stiffness matrix

We now discuss the stiffness matrix and some of its properties. Most of these have appeared in [1], with earlier versions for the sphere appearing in [6].

A consequence of the results of this section is that the problem of calculating the Galerkin solution to $\mathcal{L}u=f$ from $V_{\Xi}$ involves treating a problem whose condition number grows like $\mathcal{O}(h^{-2})$ – this is the fundamental issue that the multi-grid method seeks to overcome.

The analysis map for $\bigl{(}\chi_{\xi}\bigr{)}_{\xi\in\Xi_{\ell}}$ with respect to the bilinear form $a$ is

[TABLE]

The analysis map is a surjection.

The synthesis map is

[TABLE]

The range of the synthesis map is clearly $V_{\Xi}$ ; in other words, it is the natural isomorphism between Euclidean space and the finite dimensional kernel space; indeed, Eq. 15 shows that it is bounded above and below between $L_{2}(\mathbb{M})$ and $\ell_{2}(\Xi)$ . By abusing notation slightly, we write $\bigl{(}\sigma^{\ast}_{\Xi}\bigr{)}^{-1}:V_{\Xi}\to\mathbb{R}^{\Xi}$ . This permits a direct matrix representation of linear operators on $V_{\Xi}$ via conjugation: $S\mapsto\mathbf{S}:=(\sigma_{\Xi}^{*})^{-1}S\sigma_{\Xi}^{*}\in\mathbb{R}^{\Xi\times\Xi}$ . Furthermore, by the Riesz property Eq. 15, we have

[TABLE]

A simple calculation shows that $a\left(\sigma^{\ast}_{\Xi}(\boldsymbol{\mathbf{w}}),v\right)=\left\langle\boldsymbol{\mathbf{w}},\sigma_{\Xi}(v)\right\rangle_{2}$, so $\sigma_{\Xi}^{\ast}$ is the $a$-adjoint of $\sigma_{\Xi}$. Of course, when $a$ is symmetric, we also have $\left\langle\sigma_{\ell}(v),\boldsymbol{\mathbf{w}}\right\rangle_{2}=a\left(v,\sigma^{\ast}_{\ell}(\boldsymbol{\mathbf{w}})\right)$.

The stiffness matrix is defined as

[TABLE]

It represents the operator $\mathcal{L}$ on the finite dimensional space $V_{\Xi}$ . Using the analysis and synthesis maps, $\boldsymbol{\mathbf{A}}_{\Xi}=\sigma_{\Xi}\circ\sigma^{\ast}_{\Xi}:\left(\mathbb{R}^{\Xi},(\cdot,\cdot)_{2}\right)\to\left(\mathbb{R}^{\Xi},(\cdot,\cdot)_{2}\right):\boldsymbol{\mathbf{c}}\mapsto\left(a(\chi_{\xi},\chi_{\zeta})\right)_{\xi,\zeta\in\Xi}\boldsymbol{\mathbf{c}}$ , and the Galerkin projector $P_{\Xi}$ satisfies $P_{\Xi}=\sigma^{\ast}_{\Xi}\left(\sigma_{\Xi}\circ\sigma^{\ast}_{\Xi}\right)^{-1}\sigma_{\Xi}:W_{2}^{1}(\mathbb{M})\to V_{\Xi}$ .
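A stiffness matrix in a Lagrange basis can be assembled explicitly in a one-dimensional toy setting. The sketch below assumes the unit circle, the operator $\mathcal{L}=1-\Delta$ (so $a(u,v)=\int u'v'+uv$), equispaced centers, a Matérn-$3/2$ kernel in the chordal distance as a stand-in, and a periodic trapezoidal rule whose nodes contain the centers. None of these choices is prescribed by the paper, and the function names are hypothetical.

```python
import numpy as np

SQRT3 = np.sqrt(3.0)

def kernel(theta, eta):
    """Matern-3/2 kernel in the chordal distance on the unit circle."""
    delta = theta[:, None] - eta[None, :]
    s = SQRT3 * 2.0 * np.abs(np.sin(0.5 * delta))
    return (1.0 + s) * np.exp(-s)

def kernel_dtheta(theta, eta):
    """Derivative of the kernel with respect to its first angle argument."""
    delta = theta[:, None] - eta[None, :]
    s = SQRT3 * 2.0 * np.abs(np.sin(0.5 * delta))
    return -3.0 * np.sin(delta) * np.exp(-s)

def stiffness_matrix(n, oversampling=32):
    """Entries a(chi_xi, chi_zeta) for L = 1 - Laplacian, by trapezoidal quadrature."""
    centers = 2.0 * np.pi * np.arange(n) / n
    coeffs = np.linalg.solve(kernel(centers, centers), np.eye(n))
    m = oversampling * n                       # quadrature nodes include the centers
    nodes = 2.0 * np.pi * np.arange(m) / m
    weight = 2.0 * np.pi / m
    chi = kernel(nodes, centers) @ coeffs      # Lagrange functions at the nodes
    dchi = kernel_dtheta(nodes, centers) @ coeffs
    return weight * (dchi.T @ dchi + chi.T @ chi)

for n in (16, 32, 64):
    A = stiffness_matrix(n)
    # Diagonal entry, a far off-diagonal entry, and the condition number.
    print(n, A[0, 0], abs(A[0, n // 2]), np.linalg.cond(A))
```

The printed condition numbers give a rough numerical impression of how conditioning deteriorates under refinement in this toy case.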

Lemma 3.1.

There is a constant $C_{stiff}$ so that the entries of the stiffness matrix satisfy

[TABLE]

Proof 3.2.

We can bound the integral

[TABLE]

Decompose each inner product using the half spaces $H_{+}:=\{x\mid\mathrm{dist}(x,\xi)<\mathrm{dist}(x,\eta)\}$ and $H_{-}=\mathbb{M}\setminus H_{+}$, noting $H_{+}\subset\mathbb{M}\setminus B(\eta,R)$ and $H_{-}\subset\mathbb{M}\setminus B(\xi,R)$, with $R=\mathrm{dist}(\xi,\eta)/2$. Applying the Cauchy-Schwarz inequality to each integral gives, after combining terms,

[TABLE]

for some constant $C_{\mathcal{L}}$ depending on the coefficients of $\mathcal{L}$ .

We have $\|\chi_{\xi}\|_{W_{2}^{1}(\mathbb{M})}\leq C_{\mathfrak{B}}h^{d/2-1}$ by the Bernstein inequality Eq. 16 (with a similar estimate for $\chi_{\eta}$ ). The zeros estimate for complements of balls, [7, Corollary A.17], applied to $\chi_{\xi}$ gives $\|\chi_{\xi}\|_{W_{2}^{1}(\mathbb{M}\setminus B(\xi,R))}\leq C_{Z}h^{m-1}\|\chi_{\xi}\|_{W_{2}^{m}(\mathbb{M}\setminus B(\xi,R))}$ (with a similar estimate for $\chi_{\eta}$ ). Thus we have

[TABLE]

The lemma follows with $C_{stiff}=2C_{\mathcal{L}}C_{\mathfrak{B}}C_{Z}C_{en}\rho^{m-d/2}$.

By considering row and column sums, we obtain that

[TABLE]

with $C_{A}=C_{stiff}(1+\sum_{n=1}^{\infty}(n+2)^{d}e^{-\frac{\nu}{2\rho}n})$ .

Lemma 3.3.

Under Assumption 1, for $\Xi\subset\mathbb{M}$, the stiffness matrix satisfies

[TABLE]

with a constant $C_{inv}$ which is independent of $\Xi$.

Proof 3.4.

Coercivity ensures $|a(\sum_{\xi\in\Xi}v_{\xi}\chi_{\xi},\sum_{\xi\in\Xi}v_{\xi}\chi_{\xi})|\geq c_{0}\left\|\sum_{\xi\in\Xi}v_{\xi}\chi_{\xi}\right\|^{2}_{W_{2}^{1}(\mathbb{M})}$, and the metric equivalence $\|v\|_{W_{2}^{1}(\mathbb{M})}^{2}=\|(1-\Delta)^{1/2}v\|_{L_{2}(\mathbb{M})}^{2}$ gives

[TABLE]

is the stiffness matrix for the self-adjoint operator $1-\Delta$ . Because $\sigma(1-\Delta)\subset[1,\infty)$ , we conclude that $\boldsymbol{\mathbf{v}}\cdot\boldsymbol{\mathbf{L}}\boldsymbol{\mathbf{v}}\geq\left\|\sum_{\xi\in\Xi}v_{\xi}\chi_{\xi}\right\|^{2}_{L_{2}(\mathbb{M})}\geq C_{1}h^{d}\left\|\boldsymbol{\mathbf{v}}\right\|^{2}_{\ell_{2}(\Xi)}$ by the Riesz property Eq. 15. Hence, overall we get

[TABLE]

with $c=c_{0}C_{1}$, so $\|\boldsymbol{\mathbf{A}}_{\Xi}^{-1}\|_{\ell_{2}\to\ell_{2}}\leq\frac{1}{c_{0}C_{1}}h^{-d}$.

Consequently, the $\ell_{2}$ condition number of $\boldsymbol{\mathbf{A}}_{\Xi}$ is bounded by a multiple of $h^{-2}$ .

3.2 The diagonal of the stiffness matrix

As a counterpart to the off-diagonal decay given in Lemma 3.1, we can give the following lower bounds on the diagonal entries.

Lemma 3.5.

For an elliptic operator $\mathcal{L}$ and a kernel satisfying Assumption 1, for a mesh ratio $\rho$ , there is $C_{diag}>0$ so that for any $\Xi\subset\mathbb{M}$ with $h/q<\rho$ , and any $\xi\in\Xi$ , we have

[TABLE]

Proof 3.6.

By coercivity of $a$ , it suffices to prove that $\|\nabla\chi_{\xi}\|_{L_{2}(B(\xi,h))}^{2}\gtrsim h^{d-2}.$

We begin by establishing a Poincaré-type inequality which is valid for smooth Lagrange functions. To this end, consider $f:B(0,\mathrm{r}_{\mathbb{M}})\to\mathbb{R}$ obtained by the change of variable $f(x)=\chi_{\xi}(\exp_{\xi}(x))$ . Note that for $B\subset B(\xi,\mathrm{r}_{\mathbb{M}})$ , we have for any $0\leq k\leq m$ , that

[TABLE]

by the zeros estimate, with $C_{1}:=\frac{1}{\Gamma_{1}}\rho^{m-d/2}C_{en}C_{Z}$ , where the constants stem from Eq. 4. Now let $r:=\mathrm{dist}(\xi,\Xi\setminus\{\xi\})$ , and define $F:B(0,1)\to\mathbb{R}$ by $F:=f(r\cdot)$ . Then, by a change of variable,

[TABLE]

with $C_{2}:=\rho^{d}C_{1}$ , since $h/\rho\leq q\leq r\leq h$ .

Because of the embedding $W_{2}^{m}(B(0,1))\subset C(\overline{B(0,1)})$ (which holds since $m>d/2$ ), the set $K:=\left\{G\in W_{2}^{m}(B(0,1))\mid\|G\|_{W_{2}^{m}(B(0,1))}\leq C_{2},\,G(0)=1,G(e_{1})=0\right\}$ is well defined, closed and convex, hence weakly compact, by Banach-Alaoglu.

The natural embedding $e:W_{2}^{m}(B(0,1))\to W_{2}^{1}(B(0,1))$ is compact. We wish to show that $e(K)$ is a compact set.

Because $e$ is a continuous linear map, it is continuous between the weak topologies of $W_{2}^{m}(B(0,1))$ and $W_{2}^{1}(B(0,1))$ . Thus $e(K)$ is weakly compact in $W_{2}^{1}(B(0,1))$ , and thus norm closed. Finally, because $e(K)$ is complete and totally bounded, it is a compact subset in the norm topology of $W_{2}^{1}(B(0,1))$ .

Consider the (possibly zero) constant $c$ defined by

[TABLE]

The map $I:W_{2}^{1}(B(0,1))\to\mathbb{R}:G\mapsto\frac{\|\nabla G\|_{L_{2}}}{\|G\|_{L_{2}}}$ is continuous on the complement of $0\in W_{2}^{1}(B(0,1))$ (as quotient of two continuous functions that do not vanish). In particular, it is continuous and non-vanishing on $e(K)$ , so $c=\min_{G\in e(K)}I(G)>0$ . Indeed, $I(G)>0$ for all $G\in e(K)$ , since $G(0)=1$ , $G(e_{1})=0$ and $G\in C^{1}(B)$ .

Note that in the above minimization problem, the point $e_{1}$ in the condition $G(e_{1})=0$ could be replaced by any other point on the unit sphere without changing the value of $c$ . By rotation invariance of the $W_{2}^{m}(\mathbb{R}^{d})$ norm, it follows that $\|\nabla F\|_{L_{2}}\geq c\|F\|_{L_{2}}$ . Finally, employing the change of variables $r^{d-2}\|\nabla F\|_{L_{2}(B(0,1))}^{2}=\|\nabla f\|_{L_{2}(B(0,r))}^{2}$ and $r^{d}\|F\|_{L_{2}(B(0,1))}^{2}=\|f\|_{L_{2}(B(0,r))}^{2}$ , we have

[TABLE]

The last line follows because $\chi_{\xi}$ is Hölder continuous, so stays close to $1$ near $\xi$ . Specifically, by Eq. 14, for $\kappa:=\left(2C_{\mathfrak{H}}\right)^{-1/\epsilon}$ we have $\chi_{\xi}(x)>\frac{1}{2}$ for all $x\in B(\xi,\kappa h)$ . Thus

[TABLE]

*The lemma follows with constant $C_{diag}=\frac{1}{4}c^{2}\Gamma_{1}^{2}\rho^{-2}\alpha_{\mathbb{M}}\kappa^{d}$ . *

This brings us to the lower bound for diagonal entries of the stiffness matrix. Define the diagonal of $\boldsymbol{\mathbf{A}}_{\Xi}$ as $\boldsymbol{\mathbf{B}}_{\Xi}:=\mathrm{diag}(\boldsymbol{\mathbf{A}}_{\Xi}),$ and note that by Lemma 3.1 and Lemma 3.5,

[TABLE]

is bounded above by a constant which depends only on the mesh ratio $\rho$ (and not on $\Xi$ ).

This permits us to find suitable damping constants $0<\theta<1$ so that $\boldsymbol{\mathbf{B}}_{\Xi}$ dominates $\theta\boldsymbol{\mathbf{A}}_{\Xi}$ . This drives the success of the damped Jacobi method considered in the next section.

Lemma 3.7.

*For an elliptic operator $\mathcal{L}$ , a kernel $\phi$ satisfying Assumption 1, and mesh ratio $\rho$ , there is $\theta\in(0,1)$ so that for any point set $\Xi\subset\mathbb{M}$ , $\theta\langle\boldsymbol{\mathbf{A}}_{\Xi}\boldsymbol{\mathbf{v}},\boldsymbol{\mathbf{v}}\rangle\leq\langle\boldsymbol{\mathbf{B}}_{\Xi}\boldsymbol{\mathbf{v}},\boldsymbol{\mathbf{v}}\rangle$ for all $\boldsymbol{\mathbf{v}}\in\mathbb{R}^{\Xi}$ . *

Proof 3.8.

*By Eq. 19, $\langle\boldsymbol{\mathbf{A}}_{\Xi}\boldsymbol{\mathbf{v}},\boldsymbol{\mathbf{v}}\rangle\leq C_{A}h^{d-2}\|\boldsymbol{\mathbf{v}}\|^{2}$ , while $C_{diag}h^{d-2}\|\boldsymbol{\mathbf{v}}\|^{2}\leq\langle\boldsymbol{\mathbf{B}}_{\Xi}\boldsymbol{\mathbf{v}},\boldsymbol{\mathbf{v}}\rangle$ follows from Lemma 3.5. Thus the lemma holds for any $\theta$ in the interval $(0,{C_{diag}}/{C_{A}}]$ . *
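In practice, the largest admissible damping constant for a given symmetric positive definite matrix can be read off from a generalized eigenvalue problem. The following is a minimal sketch, assuming NumPy and using a tridiagonal stand-in matrix rather than an actual kernel stiffness matrix; the name `damping_bound` is hypothetical.

```python
import numpy as np

def damping_bound(A):
    """Largest theta with theta*<A v, v> <= <B v, v>, where B = diag(A) and A is SPD."""
    d = np.diag(A)
    # The top eigenvalue of B^{-1/2} A B^{-1/2} is the maximum of <A v, v> / <B v, v>.
    scaled = A / np.sqrt(np.outer(d, d))
    return 1.0 / np.linalg.eigvalsh(scaled)[-1]

# Stand-in SPD matrix (assumption): tridiag(-1, 3, -1), not a kernel stiffness matrix.
n = 50
A = 3.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
B = np.diag(np.diag(A))

theta_max = damping_bound(A)
# B - theta_max * A should be positive semidefinite up to rounding.
min_eig = np.linalg.eigvalsh(B - theta_max * A)[0]
```

Any $\theta\leq$ `theta_max` then satisfies the inequality of Lemma 3.7 for this particular matrix; the lemma itself provides a value that works uniformly in $\Xi$.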

4 The smoothing property

In this section we define and study the smoothing operator used in the multigrid method. We focus on the damped Jacobi method for the linear system $\boldsymbol{\mathbf{A}}_{\Xi}\boldsymbol{\mathbf{u}}^{\star}_{\Xi}=\boldsymbol{\mathbf{b}}$ , where $\boldsymbol{\mathbf{A}}_{\Xi}\in\mathbb{R}^{|\Xi|\times|\Xi|}$ and $\boldsymbol{\mathbf{b}}\in\mathbb{R}^{|\Xi|}$ . For a fixed damping parameter $0<\theta<1$ , the solution is approximated by the iterates $\boldsymbol{\mathbf{u}}^{(j)}$ of the iteration:

[TABLE]

with starting value $\boldsymbol{\mathbf{u}}^{(0)}\in\mathbb{R}^{|\Xi|}$ . Define the affine map governing a single iteration as

[TABLE]

For $k\geq 1$ , the associated damped Jacobi iteration is ${J}_{\Xi}^{k+1}(\boldsymbol{\mathbf{u}},\boldsymbol{\mathbf{b}}):={J}_{\Xi}({J}_{\Xi}^{k}(\boldsymbol{\mathbf{u}},\boldsymbol{\mathbf{b}}),\boldsymbol{\mathbf{b}})$ . For a starting value $\boldsymbol{\mathbf{u}}^{(0)}$ , we have $\boldsymbol{\mathbf{u}}^{(k)}={J}_{\Xi}^{k}(\boldsymbol{\mathbf{u}}^{(0)},\boldsymbol{\mathbf{b}})$ .
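The iteration can be sketched in a few lines. The code below assumes the update has the usual damped Jacobi form $\boldsymbol{\mathbf{u}}\mapsto\boldsymbol{\mathbf{u}}+\theta\boldsymbol{\mathbf{B}}^{-1}(\boldsymbol{\mathbf{b}}-\boldsymbol{\mathbf{A}}\boldsymbol{\mathbf{u}})$ with $\boldsymbol{\mathbf{B}}=\mathrm{diag}(\boldsymbol{\mathbf{A}})$, which matches the error propagation by $\mathrm{id}-\theta\boldsymbol{\mathbf{B}}_{\Xi}^{-1}\boldsymbol{\mathbf{A}}_{\Xi}$; the test matrix is a tridiagonal stand-in, not a kernel stiffness matrix, and the function name is hypothetical.

```python
import numpy as np

def damped_jacobi(A, b, u, theta, steps=1):
    """Apply `steps` sweeps of u <- u + theta * B^{-1} (b - A u), with B = diag(A)."""
    diag = np.diag(A)
    for _ in range(steps):
        u = u + theta * (b - A @ u) / diag
    return u

# Stand-in problem (assumption): SPD tridiagonal matrix with a known solution.
n = 20
A = 3.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
rng = np.random.default_rng(0)
u_star = rng.standard_normal(n)
b = A @ u_star
u0 = np.zeros(n)
theta, k = 0.6, 5

# Error propagation matrix S = id - theta * B^{-1} A (rows of A divided by the diagonal).
S = np.eye(n) - theta * A / np.diag(A)[:, None]

err_direct = damped_jacobi(A, b, u0, theta, steps=k) - u_star
err_via_S = np.linalg.matrix_power(S, k) @ (u0 - u_star)
```

Both error vectors agree up to rounding, which is the identity $\boldsymbol{\mathbf{u}}^{(k)}-\boldsymbol{\mathbf{u}}^{\star}=\boldsymbol{\mathbf{S}}^{k}(\boldsymbol{\mathbf{u}}^{(0)}-\boldsymbol{\mathbf{u}}^{\star})$ for this example.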

Since $\boldsymbol{\mathbf{u}}^{(k)}-\boldsymbol{\mathbf{u}}^{\star}=\left(\mathrm{id}-\theta\boldsymbol{\mathbf{B}}_{\Xi}^{-1}\boldsymbol{\mathbf{A}}_{\Xi}\right)(\boldsymbol{\mathbf{u}}^{(k-1)}-\boldsymbol{\mathbf{u}}^{\star})$ , we have the identity $\boldsymbol{\mathbf{u}}^{(k)}-\boldsymbol{\mathbf{u}}^{\star}=\left(\mathrm{id}-\theta\boldsymbol{\mathbf{B}}_{\Xi}^{-1}\boldsymbol{\mathbf{A}}_{\Xi}\right)^{k}(\boldsymbol{\mathbf{u}}^{(0)}-\boldsymbol{\mathbf{u}}^{\star}).$ This shows that the iteration converges if and only if

[TABLE]

called the smoothing matrix, is a contraction. This gives a recursive form for the error:

[TABLE]

In the context of kernel approximation, the corresponding operator, the smoothing operator, $\mathcal{S}_{\Xi}:V_{\Xi}\to V_{\Xi}$ is defined as $\mathcal{S}_{\Xi}\sigma_{\Xi}^{*}:=\sigma_{\Xi}^{*}\boldsymbol{\mathbf{S}}_{\Xi}.$ The success of the multi-grid method relies on a smoothing property, which for our purposes states that iterating $\mathcal{S}_{\Xi}$ is eventually contracting: $\|\mathcal{S}_{\Xi}^{\nu}\|_{L_{2}(\mathbb{M})\to\mathcal{L}}\leq Ch^{-1}o(\nu)$ as $\nu\to\infty$ . This smoothing property is demonstrated in Section 4.2.

4.1 $L_{2}$ stability of the smoothing operator

At this point, we are in a position to show that iterating this operator is stable on $L_{2}(\mathbb{M})$ . To help analyze the matrix $\boldsymbol{\mathbf{S}}_{\Xi}=\mathrm{id}-\theta\boldsymbol{\mathbf{B}}_{\Xi}^{-1}\boldsymbol{\mathbf{A}}_{\Xi}$ , we introduce the inner product

[TABLE]

Since $\boldsymbol{\mathbf{B}}_{\Xi}$ is diagonal, and its diagonal entries are $\langle\chi_{\xi},\chi_{\xi}\rangle_{\mathcal{L}}\sim h^{d-2}$ , we have the norm equivalence $\|\boldsymbol{\mathbf{M}}\|_{\boldsymbol{\mathbf{B}}_{\Xi}\to\boldsymbol{\mathbf{B}}_{\Xi}}\sim\|\boldsymbol{\mathbf{M}}\|_{2\to 2}$ . Specifically, we have

[TABLE]

with constant of equivalence $C_{\boldsymbol{\mathbf{B}}}=\kappa(\boldsymbol{\mathbf{B}}_{\Xi}^{1/2})=\frac{\max\sqrt{a(\chi_{\xi},\chi_{\xi})}}{\min\sqrt{a(\chi_{\xi},\chi_{\xi})}}\leq\sqrt{C_{stiff}/C_{diag}}$ .

Lemma 4.1.

For the damping parameter $\theta$ in Lemma 3.7, there is $C>0$ so that for all $n$ , $\boldsymbol{\mathbf{u}}\in\mathbb{R}^{\Xi}$ and $u=\sigma_{\Xi}^{*}\boldsymbol{\mathbf{u}}\in V_{\Xi}$ ,

[TABLE]

We note that this holds even when $a$ is non-symmetric.

Proof 4.2.

The matrix $\boldsymbol{\mathbf{B}}_{\Xi}^{1/2}(\mathrm{id}-\theta\boldsymbol{\mathbf{B}}_{\Xi}^{-1}\boldsymbol{\mathbf{A}}_{\Xi})\boldsymbol{\mathbf{B}}_{\Xi}^{-1/2}$ is symmetric, and has the same spectrum as $\mathrm{id}-\theta\boldsymbol{\mathbf{B}}_{\Xi}^{-1}\boldsymbol{\mathbf{A}}_{\Xi}$ . Furthermore, because $0\leq\langle(\boldsymbol{\mathbf{B}}_{\Xi}-\theta\boldsymbol{\mathbf{A}}_{\Xi})\boldsymbol{\mathbf{x}},\boldsymbol{\mathbf{x}}\rangle\leq\langle\boldsymbol{\mathbf{B}}_{\Xi}\boldsymbol{\mathbf{x}},\boldsymbol{\mathbf{x}}\rangle$ , the spectral radius of $\boldsymbol{\mathbf{S}}_{\Xi}$ is no greater than $1$ . Thus

[TABLE]

*It follows that $\|\boldsymbol{\mathbf{S}}_{\Xi}^{n}\|_{\boldsymbol{\mathbf{B}}_{\Xi}\to\boldsymbol{\mathbf{B}}_{\Xi}}\leq 1$ as well, and the matrix norm equivalence Eq. 22 guarantees that $\|\boldsymbol{\mathbf{S}}_{\Xi}^{n}\|_{2\to 2}\leq\kappa(\boldsymbol{\mathbf{B}}_{\Xi}^{1/2})$ . The second inequality follows from the Riesz property Eq. 15. *
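The two norm bounds in this proof can be checked numerically on a small example. The sketch below assumes NumPy and a symmetric stand-in matrix with a non-constant diagonal (not a kernel stiffness matrix); `b_norm` is a hypothetical helper that evaluates the operator norm induced by the $\boldsymbol{\mathbf{B}}$ inner product.

```python
import numpy as np

def b_norm(M, diag_b):
    """Operator norm of M induced by <x, y>_B = <B x, y>, with B = diag(diag_b)."""
    s = np.sqrt(diag_b)
    # ||M||_{B->B} equals the spectral norm of B^{1/2} M B^{-1/2}.
    return np.linalg.norm(s[:, None] * M / s[None, :], 2)

n = 30
rng = np.random.default_rng(1)
scale = rng.uniform(1.0, 2.0, n)
T = 3.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
A = scale[:, None] * T * scale[None, :]       # SPD stand-in with varying diagonal
diag_b = np.diag(A).copy()
theta = 0.6                                    # satisfies theta * A <= B for this A
S = np.eye(n) - theta * A / diag_b[:, None]    # smoothing matrix id - theta B^{-1} A

kappa = np.sqrt(diag_b.max() / diag_b.min())   # condition number of B^{1/2}
powers = [np.linalg.matrix_power(S, k) for k in range(1, 11)]
norms_b = [b_norm(P, diag_b) for P in powers]
norms_2 = [np.linalg.norm(P, 2) for P in powers]
```

For this matrix every entry of `norms_b` stays below $1$ and every entry of `norms_2` stays below `kappa`, in line with the argument above.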

4.2 Smoothing properties

For $v=\sigma_{\Xi}^{*}\boldsymbol{\mathbf{v}}\in V_{\Xi}$ , we have $\|v\|_{\mathcal{L}}=\sqrt{a(v,v)}=\sqrt{\langle\boldsymbol{\mathbf{A}}_{\Xi}\boldsymbol{\mathbf{v}},\boldsymbol{\mathbf{v}}\rangle}$ . By applying Cauchy-Schwarz, the Riesz property and Lemma 4.1, the chain of inequalities

[TABLE]

holds. In the case that $a$ is symmetric, we have the following smoothing property.

Lemma 4.3.

For a given $\rho>0$ there is a constant $C$ so that for any $\theta$ chosen as in Lemma 3.7, and $\Xi\subset\mathbb{M}$ with mesh ratio $\rho(\Xi)\leq\rho$ , the damped Jacobi iteration has smoothing operator $\mathcal{S}_{\Xi}\in L(V_{\Xi})$ which satisfies

[TABLE]

Proof 4.4.

*This is a result of [12, Theorem 7.9], which shows that $\|\boldsymbol{\mathbf{A}}_{\Xi}\boldsymbol{\mathbf{S}}_{\Xi}^{n}\boldsymbol{\mathbf{v}}\|_{\ell_{2}}\leq\frac{Ch^{d-2}}{\theta(n+1)}\|\boldsymbol{\mathbf{v}}\|_{\ell_{2}}$ . It follows that $\|\boldsymbol{\mathbf{A}}_{\Xi}\boldsymbol{\mathbf{S}}_{\Xi}^{n}\boldsymbol{\mathbf{v}}\|_{\ell_{2}}\leq\frac{Ch^{d/2-2}}{\theta(n+1)}\|{v}\|_{L_{2}}$ , and the result holds by the above discussion. *

When $a$ is not symmetric, we have the following smoothing property.

Lemma 4.5.

Let $\mathcal{S}_{\Xi}:V_{\Xi}\to V_{\Xi}:v=\sum v_{\xi}\chi_{\xi}\mapsto\sum(\boldsymbol{\mathbf{S}}_{\Xi}\boldsymbol{\mathbf{v}})_{\xi}\chi_{\xi}$ . For $\theta$ as in Lemma 3.7 we have

[TABLE]

Proof 4.6.

*This follows from [12, Theorem 7.17], and by techniques of the proof of Lemma 4.3. *

5 The direct kernel multigrid method

We are now in a position to consider the multigrid method applied to the kernel based Galerkin method we have described in the previous sections.

5.1 Setup: Grid transfer

We consider a nested sequence of point sets

[TABLE]

and associated kernel spaces $V_{\Xi_{\ell}}$ as described in Section 2.5. We denote the Lagrange basis for each such space $(\chi^{(\ell)}_{\xi})_{\xi\in\Xi_{\ell}}$ , and with it the accompanying analysis map $\sigma_{\ell}:=\sigma_{\Xi_{\ell}}$ , synthesis map $\sigma_{\ell}^{*}:=\sigma_{\Xi_{\ell}}^{*}$ and stiffness matrix $\boldsymbol{\mathbf{A}}_{\ell}:=\boldsymbol{\mathbf{A}}_{\Xi_{\ell}}$ . Moreover, we assume that there are constants $0<\gamma_{1}\leq\gamma_{2}<1$ and $\rho\geq 1$ such that

[TABLE]

Note at this point that $\rho$ is a universal constant. Hence $\rho$ does not depend on $\ell$ and thus the constants in deriving the smoothing property do not depend on $\ell$ . Thus, we obtain $n_{\ell}\sim h^{-d}_{\Xi_{\ell}}$ , for constants see Eq. 2. We will assume that $L$ is the largest index that we will consider.

In this section, we discuss grid transfer: specifically, the operators and matrices which provide communication between finite dimensional kernel spaces. These include natural prolongation and restriction maps and their corresponding matrices. We show how these can be used to relate Galerkin projectors $P_{\Xi_{\ell}}$ and stiffness matrices $\boldsymbol{\mathbf{A_{\ell}}}$ .

Prolongation and restriction

Denote the Lagrange basis of $V_{\Xi_{\ell}}$ by $(\chi_{\xi}^{(\ell)})$ , and note that, by the containment $V_{\Xi_{\ell-1}}\subset V_{\Xi_{\ell}}$ , the expansion $\chi^{(\ell-1)}_{\xi}=\sum_{\eta\in\Xi_{\ell}}\beta_{\xi,\eta}\chi^{(\ell)}_{\eta}$ holds for some matrix of coefficients $(\beta_{\xi,\eta})$ . Furthermore, from the Lagrange property, the identity $\chi^{(\ell-1)}_{\xi}(\eta)=\sum_{\zeta\in\Xi_{\ell}}\chi^{(\ell-1)}_{\xi}(\zeta)\chi^{(\ell)}_{\zeta}(\eta)$ holds for any $\eta\in\Xi_{\ell}$ . By uniqueness, we deduce that $\beta_{\xi,\eta}=\chi^{(\ell-1)}_{\xi}(\eta)$ , and we have

[TABLE]

Thus the natural injection $\mathcal{I}_{\ell-1}^{(\ell)}:V_{\Xi_{\ell-1}}\to V_{\Xi_{\ell}}$ , called the prolongation map, is described by the rectangular matrix $\boldsymbol{\mathbf{p}}_{\ell}:=\left(\chi^{(\ell-1)}_{\xi}(\eta)\right)_{\xi\in\Xi_{\ell-1},\eta\in\Xi_{\ell}}=(\sigma_{\ell}^{*})^{-1}\sigma_{\ell-1}^{*}\ \in\mathbb{R}^{n_{\ell}\times n_{\ell-1}}$ .

It is worth noting that $\mathcal{I}_{\ell-1}^{(\ell)}\sigma_{\ell-1}^{*}=\sigma_{\ell-1}^{*}$ , so we have the identity

[TABLE]

The corresponding *restriction map* $\mathcal{I}_{\ell}^{(\ell-1)}:V_{\Xi_{\ell}}\to V_{\Xi_{\ell-1}}$ is described by the transposed matrix $\boldsymbol{\mathbf{r}}_{\ell}=\left(\boldsymbol{\mathbf{p}}_{\ell}\right)^{T}$ . In other words, it is defined as $\mathcal{I}_{\ell}^{(\ell-1)}\sigma_{\ell}^{*}=\sigma_{\ell-1}^{*}\left(\boldsymbol{\mathbf{p}}_{\ell}\right)^{T}$ .

Note that we can use $\boldsymbol{\mathbf{r}}_{\ell}=\left(\boldsymbol{\mathbf{p}}_{\ell}\right)^{T}$ to relate analysis maps at different levels, since we can take the $a$ -adjoint of both sides of the equation Eq. 24 to obtain the following useful identity:

[TABLE]

Moreover, the prolongation is both bounded from above and below. This is a kernel based analogue for [12, Eq. (64)].

Lemma 5.1.

Using the notation from above, there is a constant $C_{pro}\geq 1$ depending on $\gamma$ , $\rho$ , $\mathbb{M}$ and the constants in Assumption 2 so that

[TABLE]

*holds for all $\ell\geq 1$ . *

Proof 5.2.

We begin by estimating the $\ell_{1}(\Xi_{\ell-1})\to\ell_{1}(\Xi_{\ell})$ and $\ell_{\infty}(\Xi_{\ell-1})\to\ell_{\infty}(\Xi_{\ell})$ norms of $\boldsymbol{\mathbf{p}}_{\ell}$ by taking column and row sums, respectively. These estimates can be made almost simultaneously, because the $(\xi,\eta)$ entry of $\boldsymbol{\mathbf{p}}_{\ell}$ satisfies the bound $|\chi^{(\ell-1)}_{\xi}(\eta)|\leq C_{pw}\exp(-\nu\frac{\mathrm{dist}(\xi,\eta)}{h_{\ell-1}})$ by Eq. 13.

Let $A:=\sum_{n=1}^{\infty}(n+2)^{d}\exp(-\frac{\nu}{\rho}n)$ and $B:=\sum_{n=1}^{\infty}(n+2)^{d}\exp(-\frac{\nu\gamma}{\rho}n)$ , and note that both numbers depend on $\rho,\gamma$ and the exponential decay rate $\nu$ from Eq. 13. Applying Eq. 3 gives

[TABLE]

Finally, interpolation gives an upper bound, with $C_{pro}:=C_{pw}\sqrt{(1+\frac{\beta_{\mathbb{M}}}{\alpha_{\mathbb{M}}}B)(1+\frac{\beta_{\mathbb{M}}}{\alpha_{\mathbb{M}}}A)}$ .

*Using the definition $\Upsilon_{\ell}:=\Xi_{\ell}\setminus\Xi_{\ell-1}$ , we split $\boldsymbol{\mathbf{p}}_{\ell}$ into the blocks $\boldsymbol{\mathbf{p}}_{a}=(\chi_{\xi}(\eta))_{\xi,\eta\in\Xi_{\ell-1}}$ and $\boldsymbol{\mathbf{p}}_{b}=(\chi_{\xi}(\zeta))_{\xi\in\Xi_{\ell-1},\zeta\in\Upsilon_{\ell}}$ . By the Lagrange property, $\boldsymbol{\mathbf{p}}_{a}$ is the identity matrix. Thus, we have $\left\|\boldsymbol{\mathbf{p}}_{\ell}\boldsymbol{\mathbf{c}}\right\|^{2}_{2}=\|\boldsymbol{\mathbf{c}}\|^{2}_{2}+\|\boldsymbol{\mathbf{p}}_{b}\boldsymbol{\mathbf{c}}\|^{2}_{2}\geq\|\boldsymbol{\mathbf{c}}\|^{2}_{2}$ , which gives the lower bound. *
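The structure of the prolongation matrix is easy to inspect on a toy problem. The following sketch assumes NumPy and replaces the manifold setting by nested point sets on the interval $[0,1]$ with an exponential kernel; the kernel, its scale and the function names are illustrative assumptions, not the kernels covered by Assumption 1.

```python
import numpy as np

def kernel(x, y, scale=0.2):
    """Exponential kernel on the real line (a stand-in for the paper's kernels)."""
    return np.exp(-np.abs(x[:, None] - y[None, :]) / scale)

def prolongation(coarse, fine):
    """Matrix of coarse Lagrange functions sampled on the fine set: entries chi_xi(eta)."""
    K_cc = kernel(coarse, coarse)
    K_fc = kernel(fine, coarse)
    # Lagrange functions of span{k(., xi)} satisfy chi(x) = k(x, coarse) K_cc^{-1};
    # K_cc is symmetric, so solving against K_fc^T and transposing gives K_fc K_cc^{-1}.
    return np.linalg.solve(K_cc, K_fc.T).T

fine = np.linspace(0.0, 1.0, 17)
coarse = fine[::2]                 # nested: every coarse point is also a fine point
p = prolongation(coarse, fine)     # shape (n_fine, n_coarse) = (17, 9)

rng = np.random.default_rng(0)
c = rng.standard_normal(coarse.size)
ratio = np.linalg.norm(p @ c) / np.linalg.norm(c)
```

The rows of `p` belonging to coarse points form an identity block, so `ratio` is at least $1$, which is the lower bound of Lemma 5.1 in this toy setting.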

5.2 Multigrid iteration – two level case

We now describe the multigrid algorithm, which is a composition of smoothing operators, restriction, coarse grid correction, prolongation, and then smoothing.

We begin by considering the solution of $\boldsymbol{\mathbf{A}}_{\ell}\boldsymbol{\mathbf{u}}_{\ell}^{\star}=\boldsymbol{\mathbf{b}}_{\ell}$ , where $\boldsymbol{\mathbf{A}}_{\ell}$ is the stiffness matrix associated to $V_{\Xi_{\ell}}$ , and where $\boldsymbol{\mathbf{b}}_{\ell}=\sigma_{\ell}u_{\ell}^{\star}$ is the data obtained from the Galerkin solution $u_{\ell}^{\star}=\sigma^{\ast}_{\ell}\boldsymbol{\mathbf{u}}_{\ell}^{\star}\in V_{\Xi_{\ell}}$ . Naturally, $u_{\ell}^{\star}$ is unknown (its coefficients are the solution of the above problem), but we can compute the data $\sigma_{\ell}{u}_{\ell}^{\star}$ via

[TABLE]

In other words, it is obtained from the right hand side $f$ .

The output $\boldsymbol{\mathbf{u}}^{\text{new}}_{\ell}=\operatorname{TGM}_{\ell}(\boldsymbol{\mathbf{u}}^{\text{old}}_{\ell},\boldsymbol{\mathbf{b}}_{\ell})$ of the two-level multigrid algorithm with initial input $\boldsymbol{\mathbf{u}}^{\text{old}}_{\ell}$ is given by the rather complicated formula

[TABLE]

This can be simplified by making an affine correction, that is, by considering the error $\boldsymbol{\mathbf{u}}^{\text{new}}_{\ell}-\boldsymbol{\mathbf{u}}_{\ell}^{\star}$ . We use Eq. 21 to express the affine smoothing operators in terms of a matrix product.

We can write $\boldsymbol{\mathbf{b}}_{\ell}-\boldsymbol{\mathbf{A}}_{\ell}J^{\nu_{1}}_{\ell}(\boldsymbol{\mathbf{u}}_{\ell}^{\text{old}},\boldsymbol{\mathbf{b}}_{\ell})$ as the expression $\boldsymbol{\mathbf{A}}_{\ell}\left(\boldsymbol{\mathbf{u}}_{\ell}^{\star}-J^{\nu_{1}}_{\ell}(\boldsymbol{\mathbf{u}}^{\text{old}}_{\ell},\boldsymbol{\mathbf{b}}_{\ell})\right)$ , and then, by employing Eq. 21 again, we have $-\boldsymbol{\mathbf{A}}_{\ell}\left(\boldsymbol{\mathbf{S}}_{\ell}^{\nu_{1}}(\boldsymbol{\mathbf{u}}^{\text{old}}_{\ell}-\boldsymbol{\mathbf{u}}_{\ell}^{\star})\right)$ . Now we can rewrite $\boldsymbol{\mathbf{u}}^{\text{new}}_{\ell}$ as

[TABLE]

As in [12, Eq. 48], by using the two grid iteration matrix

[TABLE]

the error can be expressed as $\boldsymbol{\mathbf{u}}^{\text{new}}_{\ell}-\boldsymbol{\mathbf{u}}_{\ell}^{\star}=\operatorname{TGM}_{\ell}(\boldsymbol{\mathbf{u}}^{\text{old}}_{\ell})-\boldsymbol{\mathbf{u}}^{\star}_{\ell}=\boldsymbol{\mathbf{C}}_{\text{TG}_{\ell}}(\boldsymbol{\mathbf{u}}^{\text{old}}_{\ell}-\boldsymbol{\mathbf{u}}^{\star}_{\ell})$ .
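A two-grid step and its error propagation matrix can be sketched as follows. The code assumes NumPy and uses a one-dimensional finite-difference Laplacian with linear interpolation as a stand-in for the kernel stiffness and prolongation matrices; the coarse matrix is formed as $\boldsymbol{\mathbf{r}}\boldsymbol{\mathbf{A}}\boldsymbol{\mathbf{p}}$ with $\boldsymbol{\mathbf{r}}=\boldsymbol{\mathbf{p}}^{T}$, and the iteration matrix is taken in the standard form $\boldsymbol{\mathbf{S}}^{\nu_{2}}(\mathrm{id}-\boldsymbol{\mathbf{p}}\boldsymbol{\mathbf{A}}_{\ell-1}^{-1}\boldsymbol{\mathbf{r}}\boldsymbol{\mathbf{A}}_{\ell})\boldsymbol{\mathbf{S}}^{\nu_{1}}$. All function names are hypothetical.

```python
import numpy as np

def laplacian_1d(n):
    """Stand-in fine-grid matrix: scaled tridiag(-1, 2, -1)."""
    return (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) * (n + 1) ** 2

def linear_prolongation(n_coarse):
    """Linear interpolation from n_coarse to 2*n_coarse + 1 interior points."""
    p = np.zeros((2 * n_coarse + 1, n_coarse))
    for j in range(n_coarse):
        p[2 * j:2 * j + 3, j] = (0.5, 1.0, 0.5)
    return p

def jacobi(A, b, u, theta, steps):
    d = np.diag(A)
    for _ in range(steps):
        u = u + theta * (b - A @ u) / d
    return u

def two_grid(A, p, b, u, theta, nu1, nu2):
    r = p.T
    u = jacobi(A, b, u, theta, nu1)                     # pre-smoothing
    e_c = np.linalg.solve(r @ A @ p, r @ (b - A @ u))   # coarse-grid correction
    return jacobi(A, b, u + p @ e_c, theta, nu2)        # post-smoothing

p = linear_prolongation(15)
n = p.shape[0]
A = laplacian_1d(n)
theta, nu1, nu2 = 0.5, 2, 1
rng = np.random.default_rng(0)
u_star, u_old = rng.standard_normal(n), rng.standard_normal(n)
b = A @ u_star

S = np.eye(n) - theta * A / np.diag(A)[:, None]
mp = np.linalg.matrix_power
C_tg = mp(S, nu2) @ (np.eye(n) - p @ np.linalg.solve(p.T @ A @ p, p.T @ A)) @ mp(S, nu1)

u_new = two_grid(A, p, b, u_old, theta, nu1, nu2)
rho = np.max(np.abs(np.linalg.eigvals(C_tg)))           # spectral radius of C_tg
```

Here `u_new - u_star` coincides with `C_tg @ (u_old - u_star)` up to rounding, and `rho` is below $1$ for this stand-in problem.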

The corresponding operator on $V_{\Xi_{\ell}}$ is obtained by conjugating with $\sigma_{\ell}^{*}$ . This gives the error operator for the two level method $\mathcal{C}_{\text{TG}_{\ell}}\sigma_{\ell}^{*}:=\sigma_{\ell}^{*}\boldsymbol{\mathbf{C}}_{\text{TG}_{\ell}}$ .

It is worth noting that by Eq. 24 we have the equality $\sigma_{\ell}^{*}\boldsymbol{\mathbf{p}}_{\ell}\left(\boldsymbol{\mathbf{A}}_{\ell-1}\right)^{-1}\boldsymbol{\mathbf{r}}_{\ell}\boldsymbol{\mathbf{A}}_{\ell}=\sigma_{\ell-1}^{*}\left(\boldsymbol{\mathbf{A}}_{\ell-1}\right)^{-1}\boldsymbol{\mathbf{r}}_{\ell}\boldsymbol{\mathbf{A}}_{\ell}$ . Using the identity $\boldsymbol{\mathbf{A}}_{\ell}=\sigma_{\ell}\sigma_{\ell}^{*}$ followed by Eq. 25 and the identity $P_{\Xi_{\ell-1}}=\sigma^{\ast}_{\ell-1}\left(\boldsymbol{\mathbf{A}}_{\ell-1}\right)^{-1}\sigma_{\ell-1}$ gives

[TABLE]

It follows that $\mathcal{C}_{\text{TG}_{\ell}}=\mathcal{S}_{\ell}^{\nu_{2}}(\mathrm{id}_{V_{\Xi_{\ell-1}}}-P_{\Xi_{\ell-1}})\mathcal{S}_{\ell}^{\nu_{1}}$ .

We are now in a position to show that the two level method is a contraction for sufficiently large values of $\nu_{1}$ .

Proposition 5.3.

There is a constant $C$ so that for all $\ell$ , $\mathcal{C}_{\text{TG}_{\ell}}$ satisfies the bound

[TABLE]

Proof 5.4.

We have that $\left\|\mathcal{C}_{\text{TG}_{\ell}}\right\|=\left\|\mathcal{S}_{\ell}^{\nu_{2}}(\mathrm{id}_{V_{\Xi_{\ell-1}}}-P_{\Xi_{\ell-1}})\mathcal{S}_{\ell}^{\nu_{1}}\right\|$ holds, so Lemma 4.1 ensures

[TABLE]

By Lemma 2.1, $\left\|\mathrm{id}_{V_{\Xi_{\ell-1}}}-P_{\Xi_{\ell-1}}\right\|_{W_{2}^{1}(\mathbb{M})\to L_{2}(\mathbb{M})}\leq Ch_{\ell-1}$ holds, so it follows that

[TABLE]

*By coercivity, this gives $\left\|\mathcal{C}_{\text{TG}_{\ell}}\right\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}\leq Ch_{\ell-1}\|\mathcal{S}_{\ell}^{\nu_{1}}\|_{L_{2}(\mathbb{M})\to\mathcal{L}}$ . Finally, the result follows by applying the smoothing property: Lemma 4.3 in the symmetric case and Lemma 4.5 in the non-symmetric case. *

Corollary 5.5.

Let $\theta$ be as in Lemma 3.7 and let $\boldsymbol{\mathbf{u}}^{\text{old}}_{\ell}\in\mathbb{R}^{n_{\ell}}$ be an initial guess, $u^{\text{old}}_{\ell}=\sigma_{\ell}^{*}(\boldsymbol{\mathbf{u}}^{\text{old}}_{\ell})$ , $\boldsymbol{\mathbf{u}}_{\ell}=\operatorname{TGM}_{\ell}(\boldsymbol{\mathbf{u}}^{\text{old}}_{\ell})$ , and $u_{\ell}=\sigma_{\ell}^{*}\boldsymbol{\mathbf{u}}_{\ell}$ . If $a$ is symmetric, there is a constant $C$ independent of $\ell$ and $\theta$ so that

[TABLE]

If $a$ is not symmetric, there is a constant $C$ independent of $\ell$ and $\theta$ so that

[TABLE]

Proof 5.6.

*This follows by applying Proposition 5.3 to $u^{\text{old}}_{\ell}-u_{\ell}^{\star}$ . *

5.3 Multigrid with $\tau$ -cycle

In the two-grid method, the computational bottleneck remains the solution on the coarse grid. Thus, there have been many approaches to recursively apply the multi-grid philosophy in order to use a direct solver only on the coarsest grid. A flexible algorithm is the so-called $\tau$ -cycle. Here $\tau=1$ stands for the $V$ -cycle in multi-grid methods and $\tau=2$ gives the $W$ -cycle.

Our results hold for $\tau\geq 2$ .

Before proving our main theorem, we need a statement from elementary real analysis.

Lemma 5.7.

For any real numbers $\alpha,\beta,\gamma,\tau$ which satisfy $0<\gamma<1$ , $\tau\geq 2$ , $\beta>1/\tau$ and $\alpha<\min\left\{\frac{\tau-1}{\tau}(\beta\tau)^{-\frac{1}{\tau-1}},\frac{\tau-1}{\tau}\gamma\right\}$ , if the sequence $(x_{n})_{n\in\mathbb{N}_{0}}$ satisfies the conditions

[TABLE]

*then $x_{n}\leq\gamma$ for all $n\geq 0$ . *

Proof 5.8.

*This follows by elementary calculations, as in [8, Lemma 6.15]. *

Using [12, Theorem 7.1], we obtain for the iteration matrix of Algorithm 2 the recursive (in the level) form

[TABLE]

Again, we define the corresponding operator via $\mathcal{C}_{\ell}:=\sigma_{\ell}^{*}\boldsymbol{\mathbf{C}}_{\ell}(\sigma_{\ell}^{*})^{-1}$ .
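The recursion behind the $\tau$-cycle can be sketched as follows. This is the standard textbook recursion (pre-smoothing, $\tau$ recursive coarse corrections starting from zero, post-smoothing, direct solve on the coarsest level), written under the assumption that Algorithm 2 has this shape; the hierarchy is a one-dimensional finite-difference stand-in with Galerkin coarse matrices, and all names are hypothetical.

```python
import numpy as np

def build_hierarchy(levels):
    """Stand-in hierarchy: 1D Laplacian on the finest level, A_{l-1} = p^T A_l p below."""
    n = 2 ** (levels + 1) - 1
    As = [(2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) * (n + 1) ** 2]
    ps = []
    for _ in range(levels):
        n_c = (n - 1) // 2
        p = np.zeros((n, n_c))
        for j in range(n_c):
            p[2 * j:2 * j + 3, j] = (0.5, 1.0, 0.5)   # linear interpolation
        ps.append(p)
        As.append(p.T @ As[-1] @ p)
        n = n_c
    return As[::-1], ps[::-1]     # index 0 is the coarsest level; ps[l-1] maps l-1 -> l

def mgm(l, As, ps, u, b, tau=2, theta=0.5, nu1=2, nu2=1):
    """One tau-cycle on level l for A_l u = b, starting from the guess u."""
    A = As[l]
    if l == 0:
        return np.linalg.solve(A, b)                  # direct solve on the coarsest grid
    d = np.diag(A)
    for _ in range(nu1):                              # pre-smoothing (damped Jacobi)
        u = u + theta * (b - A @ u) / d
    p = ps[l - 1]
    r_c = p.T @ (b - A @ u)                           # restricted residual
    e = np.zeros(p.shape[1])
    for _ in range(tau):                              # tau recursive calls
        e = mgm(l - 1, As, ps, e, r_c, tau, theta, nu1, nu2)
    u = u + p @ e                                     # prolongated correction
    for _ in range(nu2):                              # post-smoothing
        u = u + theta * (b - A @ u) / d
    return u

As, ps = build_hierarchy(4)
L = len(As) - 1
rng = np.random.default_rng(0)
u_star = rng.standard_normal(As[L].shape[0])
b = As[L] @ u_star
u = np.zeros_like(b)
errors = []
for _ in range(10):
    u = mgm(L, As, ps, u, b, tau=2)
    errors.append(np.linalg.norm(u - u_star))
```

With `tau=2` this is a $W$-cycle; `tau=1` gives the $V$-cycle, which is outside the range $\tau\geq 2$ covered by the analysis here.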

Theorem 5.9.

For every $\gamma\in(0,1)$ , there is a $\nu^{\star}:=\min\{\nu\in\mathbb{N}\ :\ C_{\text{Prop. 5.3}}g(\nu)\leq\min\left\{\frac{\tau-1}{\tau}(\beta_{\text{Thm. 5.9}}\tau)^{-\frac{1}{\tau-1}},\frac{\tau-1}{\tau}\gamma\right\}\}$ . For all $\nu_{1}\geq\nu^{\star}$ we have

[TABLE]

Proof 5.10.

Here, we essentially follow [12, Proof of Theorem 7.20]. Let $\boldsymbol{\mathbf{v}}\in\mathbb{R}^{n_{\ell}}$ be arbitrary. For $v=\sigma^{*}_{\ell}\boldsymbol{\mathbf{v}}$ , we obtain for $\ell\in\mathbb{N}$ ,

[TABLE]

by Proposition 5.3 and Lemma 4.1. We treat the second term with Eq. 24 and Eq. 29 to obtain

[TABLE]

This leaves

[TABLE]

The last factor can be bounded by writing $P_{\Xi_{\ell-1}}=\mathrm{id}-(\mathrm{id}-P_{\Xi_{\ell-1}})$ followed by the triangle inequality. Lemma 4.1 bounds $\|\mathcal{S}_{\ell}^{\nu_{1}}\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}$ , while Proposition 5.3 (with $\nu_{2}=0$ ) bounds $\|(\mathrm{id}-P_{\Xi_{\ell-1}})\mathcal{S}_{\ell}^{\nu_{1}}\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}$ . Thus, we end up with a bound

[TABLE]

which has the form required by Lemma 5.7, with $x_{\ell}=\left\|\mathcal{C}_{\ell}\right\|_{L_{2}\to L_{2}}$ , $\beta_{\text{Thm. 5.9}}:=C$ and $\alpha=\left\|\mathcal{C}_{\text{TG}_{\ell}}\right\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}$ . The condition $\nu_{1}\geq\nu^{\star}$ ensures the bound $\alpha\leq\min\left\{\frac{\tau-1}{\tau}(\beta_{\text{Thm. 5.9}}\tau)^{-\frac{1}{\tau-1}},\frac{\tau-1}{\tau}\gamma\right\}$ . Thus

[TABLE]

*holds by Lemma 5.7, and the theorem follows. *

Remark 5.11.

At the finest level, the kernel-based Galerkin problem $\boldsymbol{\mathbf{A}}_{L}\boldsymbol{\mathbf{x}}=\boldsymbol{\mathbf{b}}_{L}$ can be solved stably to any precision $\epsilon_{\max}$ by iterating the contraction matrix $\boldsymbol{\mathbf{C}}_{L}{\tau}$ . Select $\gamma<1$ and fix $\nu_{1}$ so that Theorem 5.9 holds. Letting $\boldsymbol{\mathbf{u}}^{(k+1)}=\operatorname{MGM}^{(\tau)}_{L}(\boldsymbol{\mathbf{u}}^{(k)},\boldsymbol{\mathbf{b}}_{L})$ gives $\|\boldsymbol{\mathbf{u}}^{*}-\boldsymbol{\mathbf{u}}^{(k)}\|_{\ell_{2}}\leq\gamma^{k}\|\boldsymbol{\mathbf{u}}^{*}-\boldsymbol{\mathbf{u}}^{(0)}\|_{\ell_{2}}$ . If $k$ is the least integer satisfying $\gamma^{k}\|\boldsymbol{\mathbf{u}}^{*}-\boldsymbol{\mathbf{u}}^{(0)}\|<\epsilon_{\max}$ , then $k\sim\frac{1}{\log\gamma}\log\left(\frac{\epsilon_{\max}}{\|\boldsymbol{\mathbf{u}}^{*}-\boldsymbol{\mathbf{u}}^{(0)}\|}\right)$ .

*We note that $\|\boldsymbol{\mathbf{u}}^{*}-\boldsymbol{\mathbf{u}}^{(k)}\|_{\boldsymbol{\mathbf{A}}_{L}}\sim\|\sigma_{L}^{*}\boldsymbol{\mathbf{u}}^{*}-\sigma_{L}^{*}\boldsymbol{\mathbf{u}}^{(k)}\|_{W_{2}^{1}}\leq C_{\mathfrak{B}}h^{d/2-1}\|\boldsymbol{\mathbf{u}}^{*}-\boldsymbol{\mathbf{u}}^{(k)}\|_{\ell_{2}},$ and since $d\geq 2$ , achieving $\|\boldsymbol{\mathbf{u}}^{*}-\boldsymbol{\mathbf{u}}^{(k)}\|_{\boldsymbol{\mathbf{A}}_{L}}<\epsilon_{\max}$ also requires only a fixed number of iterations. This shows Eq. 42. *
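The iteration count in this remark is straightforward to evaluate. The helper below is a small sketch using only the Python standard library; the function name and argument names are hypothetical, and the two correction loops only guard against rounding in the logarithm.

```python
import math

def iterations_needed(gamma, initial_error, eps_max):
    """Least integer k >= 0 with gamma**k * initial_error < eps_max, for 0 < gamma < 1."""
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must lie in (0, 1)")
    if initial_error < eps_max:
        return 0
    # Closed-form estimate k ~ log(eps_max / initial_error) / log(gamma).
    k = max(0, math.ceil(math.log(eps_max / initial_error) / math.log(gamma)))
    while gamma ** k * initial_error >= eps_max:            # enforce the strict inequality
        k += 1
    while k > 0 and gamma ** (k - 1) * initial_error < eps_max:
        k -= 1                                              # make sure k is the least such integer
    return k

k_example = iterations_needed(0.5, 1.0, 1e-3)
```

For a contraction factor of $0.5$, a unit initial error and tolerance $10^{-3}$, this gives $k=10$, since $0.5^{10}\approx 9.8\cdot 10^{-4}$ while $0.5^{9}\approx 2.0\cdot 10^{-3}$.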

6 The perturbed multigrid method

In this section, we consider a modified problem

[TABLE]

where $\boldsymbol{\mathbf{\check{A}}}_{L}$ is close to $\boldsymbol{\mathbf{A}}_{L}$ . The perturbed multigrid method will produce an approximate solution $\check{\boldsymbol{\mathbf{u}}}^{(k)}_{L}$ to $\boldsymbol{\mathbf{u}}^{\star}_{L}$ which satisfies $\|\check{\boldsymbol{\mathbf{u}}}^{(k)}_{L}-\boldsymbol{\mathbf{u}}^{\star}_{L}\|\leq\|\check{\boldsymbol{\mathbf{u}}}^{(k)}_{L}-\check{\boldsymbol{\mathbf{u}}}^{\star}_{L}\|+\|\boldsymbol{\mathbf{\check{A}}}_{L}^{-1}-\boldsymbol{\mathbf{A}}_{L}^{-1}\|\|\boldsymbol{\mathbf{b}}_{L}\|$ . Thus, for the true solution $u$ to Eq. 8 and the Galerkin solution $u^{\star}_{L}=P_{\Xi_{L}}u=\sigma_{L}^{*}\boldsymbol{\mathbf{u}}^{\star}_{L}$ , we have

[TABLE]

which can be made as close to $\|(1-P_{\Xi_{L}})u\|_{L_{2}}$ as desired by controlling the perturbation $\|\boldsymbol{\mathbf{\check{A}}}_{L}-\boldsymbol{\mathbf{A}}_{L}\|_{2\to 2}$ and the error from the multi-grid approximation $\|\boldsymbol{\mathbf{\check{u}}}_{L}^{(k)}-\boldsymbol{\mathbf{\check{u}}}_{L}^{\star}\|$ .

Such systems may occur for a number of reasons: using localized Lagrange basis functions (as in [11]), truncating a series expansion of the kernel (as in [1]), or by using quadrature to approximate the stiffness matrix (both [11] and [1]). In the next section, we will apply this by truncating the original stiffness matrix to employ only banded matrices and thereby enjoy a computational speed up.

Perturbed multigrid method

The perturbed multigrid method replaces matrices $\boldsymbol{\mathbf{A}}_{\ell}$ , $\boldsymbol{\mathbf{p}}_{\ell}$ and $\boldsymbol{\mathbf{r}}_{\ell}$ appearing in Algorithms 1 and 2 with matrices $\boldsymbol{\mathbf{\check{A}}}_{\ell}$ , $\check{\boldsymbol{\mathbf{p}}}_{\ell}$ and $\check{\boldsymbol{\mathbf{r}}}_{\ell}$ . We assume that for each $\ell$ there exists $0<\epsilon_{\ell}$ so that

[TABLE]

In this setup $\epsilon_{\ell}$ may change per level. (This could be the case, e.g., if $\boldsymbol{\mathbf{\check{A}}}_{\ell}$ involved a $\Xi_{\ell}$ dependent quadrature scheme, or was obtained by bandlimiting, as we will do in the next section.) We assume $\epsilon_{\ell}$ is sufficiently small that

[TABLE]

It then follows from standard arguments that

[TABLE]

Because of the entry-wise error $|(\boldsymbol{\mathbf{\check{A}}}_{\ell})_{\xi,\xi}-(\boldsymbol{\mathbf{A}}_{\ell})_{\xi,\xi}|\leq\epsilon_{\ell}$ , we also have that the diagonal matrix $\boldsymbol{\mathbf{\check{B}}}_{\ell}=\operatorname{diag}(\boldsymbol{\mathbf{\check{A}}}_{\ell})$ satisfies

[TABLE]

Therefore, there is $\theta$ so that for all $\ell$ and all $\boldsymbol{\mathbf{x}}\in\mathbb{R}^{n_{\ell}}$ , $\theta\langle\boldsymbol{\mathbf{\check{A}}}_{\ell}\boldsymbol{\mathbf{x}},\boldsymbol{\mathbf{x}}\rangle\leq\langle\boldsymbol{\mathbf{\check{B}}}_{\ell}\boldsymbol{\mathbf{x}},\boldsymbol{\mathbf{x}}\rangle$ holds. This permits us to consider the Jacobi iteration applied to the perturbed linear system $\boldsymbol{\mathbf{\check{A}}}_{\ell}\boldsymbol{\mathbf{x}}_{\ell}=\boldsymbol{\mathbf{b}},$ which yields $\check{J}_{\ell}:\mathbb{R}^{n_{\ell}}\times\mathbb{R}^{n_{\ell}}\to\mathbb{R}^{n_{\ell}}$ defined by

[TABLE]

Since $\boldsymbol{\mathbf{S}}_{\ell}-\boldsymbol{\mathbf{\check{S}}}_{\ell}=\theta\Bigl{(}\boldsymbol{\mathbf{\check{B}}}_{\ell}^{-1}(\boldsymbol{\mathbf{\check{A}}}_{\ell}-\boldsymbol{\mathbf{A}}_{\ell})+(\boldsymbol{\mathbf{\check{B}}}_{\ell}^{-1}-\boldsymbol{\mathbf{B}}_{\ell}^{-1})\boldsymbol{\mathbf{A}}_{\ell}\Bigr{)}$ , we can estimate the error between smoothing matrices as

[TABLE]

Because $\theta\langle\boldsymbol{\mathbf{\check{A}}}_{\ell}\boldsymbol{\mathbf{x}},\boldsymbol{\mathbf{x}}\rangle\leq\langle\boldsymbol{\mathbf{\check{B}}}_{\ell}\boldsymbol{\mathbf{x}},\boldsymbol{\mathbf{x}}\rangle$ it follows that for all $n$ ,

[TABLE]

This also yields the following Lemma.

Lemma 6.1.

For $M\in\mathbb{N}$ , we get the bound

[TABLE]

Proof 6.2.

By telescoping, we have $\boldsymbol{\mathbf{S}}_{\ell}^{M}-\boldsymbol{\mathbf{\check{S}}}_{\ell}^{M}=\sum_{j=0}^{M-1}\boldsymbol{\mathbf{S}}_{\ell}^{M-1-j}\left(\boldsymbol{\mathbf{S}}_{\ell}-\boldsymbol{\mathbf{\check{S}}}_{\ell}\right)\boldsymbol{\mathbf{\check{S}}}_{\ell}^{j}$ . The inequality

[TABLE]

*follows from norm properties, and the result follows by applying Eq. 33 and Lemma 4.1. *
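The telescoping identity used in this proof holds for arbitrary square matrices and can be verified numerically. The sketch below assumes NumPy and uses random matrices in place of the smoothing matrices; it also checks the crude bound obtained from submultiplicativity of the spectral norm.

```python
import numpy as np

rng = np.random.default_rng(0)
n, M = 6, 5
S = rng.standard_normal((n, n)) / n                    # stand-in for S_l
S_check = S + 1e-3 * rng.standard_normal((n, n))       # stand-in for the perturbed matrix

mp = np.linalg.matrix_power
lhs = mp(S, M) - mp(S_check, M)
# Telescoping sum: sum_{j=0}^{M-1} S^{M-1-j} (S - S_check) S_check^j
rhs = sum(mp(S, M - 1 - j) @ (S - S_check) @ mp(S_check, j) for j in range(M))

# Norm bound from submultiplicativity of the spectral norm.
c = max(np.linalg.norm(S, 2), np.linalg.norm(S_check, 2))
bound = M * c ** (M - 1) * np.linalg.norm(S - S_check, 2)
```

In the setting of Lemma 6.1 the powers of both smoothing matrices are uniformly bounded, so the factor `c ** (M - 1)` is replaced by a constant independent of $M$.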

This lemma can be combined with the estimate Eq. 32 to obtain

[TABLE]

Perturbed two grid method

Now, we consider the perturbed version of the two step algorithm. We aim to apply the two-grid method with only truncated matrices to the problem

[TABLE]

Applying the two-grid method to Eq. 35, we obtain $\check{\boldsymbol{\mathbf{u}}}^{\star}_{\ell}-\check{\boldsymbol{\mathbf{u}}}^{\text{new}}_{\ell}=\boldsymbol{\mathbf{\check{C}}}_{\text{TG}_{\ell}}\left(\check{\boldsymbol{\mathbf{u}}}^{\star}_{\ell}-\check{\boldsymbol{\mathbf{u}}}^{\text{old}}_{\ell}\right),$ where the two grid iteration matrix is

[TABLE]

Lemma 6.3.

If $\epsilon_{\ell}\leq h_{\ell}^{d+2}$ holds for all $\ell\leq L$ , then

[TABLE]

Remark 6.4.

A basic idea, used throughout this section, is the following result: If $\max(\|M_{j}\|,\|\check{M}_{j}\|)\leq C_{j}$ , then

[TABLE]
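One standard way to obtain product perturbation bounds of this type is a telescoping identity. The following is a sketch for $n$ factors in any submultiplicative norm, with the ordering of the factors kept fixed; the exact form and constants of the displayed estimate may differ.

```latex
\prod_{j=1}^{n}\check{M}_{j}-\prod_{j=1}^{n}M_{j}
  =\sum_{j=1}^{n}\Bigl(\prod_{i<j}M_{i}\Bigr)\bigl(\check{M}_{j}-M_{j}\bigr)\Bigl(\prod_{i>j}\check{M}_{i}\Bigr),
\qquad
\Bigl\|\prod_{j=1}^{n}\check{M}_{j}-\prod_{j=1}^{n}M_{j}\Bigr\|
  \leq\sum_{j=1}^{n}\|\check{M}_{j}-M_{j}\|\prod_{i\neq j}C_{i}.
```

For $n=2$ the identity reads $\check{M}_{1}\check{M}_{2}-M_{1}M_{2}=(\check{M}_{1}-M_{1})\check{M}_{2}+M_{1}(\check{M}_{2}-M_{2})$.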

Proof 6.5 (Proof of Lemma 6.3).

Consider

[TABLE]

By Remark 6.4, we have

[TABLE]

Now, we consider the difference

[TABLE]

*Because $\|\mathrm{id}-\check{\boldsymbol{\mathbf{p}}}_{\ell}\boldsymbol{\mathbf{\check{A}}}_{\ell-1}^{-1}\check{\boldsymbol{\mathbf{r}}}_{\ell}\boldsymbol{\mathbf{\check{A}}}_{\ell}\|\leq Ch_{\ell}^{-2}$ and $\|\mathrm{id}-\boldsymbol{\mathbf{p}}_{\ell}\boldsymbol{\mathbf{A}}_{\ell-1}^{-1}\boldsymbol{\mathbf{r}}_{\ell}\boldsymbol{\mathbf{A}}_{\ell}\|\leq Ch_{\ell}^{-2}$ , the lemma follows by using Remark 6.4 with Eqs. 32 and 34, Lemma 6.1, and the above estimate of $\|E_{1}\|$ . *

Perturbed $\tau$ -cycle

As in the two-step case we consider the multigrid method also for the truncated system in Eq. 35, i.e., $\boldsymbol{\mathbf{\check{A}}}_{\ell}\check{\boldsymbol{\mathbf{u}}}^{\star}_{\ell}=\boldsymbol{\mathbf{b}}_{\ell}=\sigma_{\ell}u^{\star}_{\ell}$ . The multi-grid iteration matrix is

[TABLE]

From this we define the operator $\mathcal{\check{C}}_{\ell}:=\sigma_{\ell}^{*}\boldsymbol{\mathbf{\check{C}}}_{\ell}(\sigma_{\ell}^{*})^{-1}$ .

Theorem 6.6.

For any $0<\gamma<1$ , there exist constants $C_{1}$ , $C_{2}$ and $C_{4}$ and $\nu^{\star}\in\mathbb{N}$ such that $C_{\text{Prop. 5.3}}g(\nu^{\star})\leq C_{4}\min\left\{\frac{\tau-1}{\tau}(\beta_{\text{Thm. 5.9}}\tau)^{-\frac{1}{\tau-1}},\frac{\tau-1}{\tau}\gamma\right\}$ . For $\nu_{1}\geq\nu^{\star}$ , if $\epsilon_{\ell}$ is chosen small enough that $\epsilon_{\ell}h_{\ell}^{-(d+2)}(h_{\ell}^{-4}+\nu_{1}+\nu_{2})<C^{-1}_{\text{Lem. 6.3}}\min(C_{1},\gamma C_{2})$ for all $0\leq\ell\leq L$ , and if $h_{0}\leq\left(C^{-1}_{\text{Lem. 6.3}}\min(C_{1},\gamma C_{2})\right)^{-1/4}$ , then

[TABLE]

Proof 6.7.

As in the proof of Theorem 5.9, we make the estimate

[TABLE]

Then Eq. 33 ensures that $\|\sigma_{\ell}^{*}\check{\boldsymbol{\mathbf{p}}}_{\ell}(\sigma_{\ell-1}^{*})^{-1}\mathcal{\check{C}}_{\ell-1}^{\tau}\sigma_{\ell-1}^{*}\boldsymbol{\mathbf{\check{A}}}_{\ell-1}^{-1}\check{\boldsymbol{\mathbf{r}}}_{\ell}\boldsymbol{\mathbf{\check{A}}}_{\ell}(\sigma_{\ell}^{*})^{-1}\mathcal{\check{S}}_{\ell}^{\nu_{1}}\|$ controls the second expression. Considering the difference $E:=\sigma_{\ell}^{*}\check{\boldsymbol{\mathbf{p}}}_{\ell}\boldsymbol{\mathbf{\check{C}}}_{\ell-1}^{\tau}\boldsymbol{\mathbf{\check{A}}}_{\ell-1}^{-1}\check{\boldsymbol{\mathbf{r}}}_{\ell}\boldsymbol{\mathbf{\check{A}}}_{\ell}(\sigma_{\ell}^{*})^{-1}\mathcal{\check{S}}_{\ell}^{\nu_{1}}-\sigma_{\ell}^{*}\boldsymbol{\mathbf{p}}_{\ell}\boldsymbol{\mathbf{\check{C}}}_{\ell-1}^{\tau}\boldsymbol{\mathbf{A}}_{\ell-1}^{-1}\boldsymbol{\mathbf{r}}_{\ell}\boldsymbol{\mathbf{A}}_{\ell}(\sigma_{\ell}^{*})^{-1}\mathcal{S}_{\ell}^{\nu_{1}}$ , Remark 6.4 gives

[TABLE]

Using the Riesz property, this gives

[TABLE]

As in the proof of Theorem 5.9, the last normed expression can be bounded as

[TABLE]

Because $(h_{\ell}^{-2}+\nu_{1})h_{\ell}^{-d}\epsilon_{\ell}$ is bounded by assumption, it follows that

[TABLE]

Since, by assumption, $\epsilon_{\ell}\leq h_{\ell}^{d+2}$ holds for all $\ell\leq L$ (no constants are needed because $h\leq h_{0}$), Lemma 6.3 gives

[TABLE]

We use Theorem 5.9 and choose a natural number $\nu^{\star}_{1}$ large enough such that the inequality $C_{\text{Prop.\ref{prop:boundtwogridop}}}g(\nu^{\star}_{1})\leq C_{4}\min\left\{\frac{\tau-1}{\tau}(\beta_{\text{Thm.\ref{thm:main}}}\tau)^{-\frac{1}{\tau-1}},\frac{\tau-1}{\tau}\gamma\right\}$ is satisfied. Thus, we obtain

[TABLE]

We have

[TABLE]

Thus, we define

[TABLE]

Hence Lemma 5.7 applies and the result follows.

7 Truncated multigrid method

In this section we consider truncating the stiffness, prolongation and restriction matrices in order to improve the computational complexity of the method. Each such matrix has stationary, exponential off-diagonal decay, so retaining the $(\xi,\eta)$ entry when $\mathrm{dist}(\xi,\eta)\leq Kh_{\ell}|\log h_{\ell}|$ and setting all other entries to zero guarantees a small perturbation error (on the order of $\mathcal{O}(h_{\ell}^{J})$, where $J\propto K$). This is made precise in Lemmas 7.3 and 7.5 below, with the aid of the following lemma.

Lemma 7.1.

Suppose $\Xi\subset\mathbb{M}$ , $c>0$ , and $r\geq 2q(\Xi)$ . Then for any $\eta\in\mathbb{M}$ , we have

[TABLE]

Proof 7.2.

The underlying set can be decomposed as $\{\xi\in\Xi\mid\mathrm{dist}(\xi,\eta)\geq r\}=\bigcup_{j=0}^{\infty}\mathcal{A}_{j}$, where $\mathcal{A}_{j}=\{\xi\in\Xi\mid r+jq\leq\mathrm{dist}(\xi,\eta)<r+(j+1)q\}$ has cardinality $\#\mathcal{A}_{j}\leq\frac{\beta_{\mathbb{M}}}{\alpha_{\mathbb{M}}}\bigl(\frac{r}{q}+(j+2)\bigr)^{d}$. It follows that

[TABLE]

and the lemma follows from the fact that, since $r\geq 2q$, we have $\frac{r}{q}+j+2\leq\frac{r}{q}(j+2)$ for all $j\geq 0$.
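The elementary inequality that closes the proof can be checked in one line. The following short derivation is a sketch that uses the shorthand $a$ and $b$ (introduced here only for this purpose) together with the hypothesis $r\geq 2q$ of Lemma 7.1.

```latex
a := \frac{r}{q} \geq 2, \qquad b := j + 2 \geq 2
\quad\Longrightarrow\quad
ab - (a + b) = (a-1)(b-1) - 1 \geq 1\cdot 1 - 1 = 0,
\qquad\text{hence}\qquad
\frac{r}{q} + j + 2 \leq \frac{r}{q}\,(j+2).
```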

Truncated stiffness matrix

The exponential decay in Lemma 3.1 motivates the truncation of the stiffness matrix, see e.g. [11, Eq. (8.1)]. For positive $K$, we define the truncation parameter $r_{\Xi}:=Kh|\log(h)|$ and

[TABLE]

We note that $\check{\boldsymbol{\mathbf{A}}}_{\Xi;r_{\Xi}}$ is symmetric if $\boldsymbol{\mathbf{A}}_{\Xi}$ is symmetric. By construction and quasi-uniformity, we obtain $\#\{\xi\in B(\eta,r_{\Xi})\cap\Xi\}\leq\rho^{d}h^{-d}_{\Xi,\mathbb{M}}\frac{\beta_{\mathbb{M}}}{\alpha_{\mathbb{M}}}(r_{\Xi}+q)^{d}\leq 2\frac{\beta_{\mathbb{M}}}{\alpha_{\mathbb{M}}}\rho^{d}|K\log h|^{d}.$ By $h^{-d}\leq C|\Xi|$ , this yields

$\#\{\xi\in\Xi\,\mid\,(\check{\boldsymbol{\mathbf{A}}}_{\Xi;r_{\Xi}})_{\xi,\eta}\neq 0\}\leq 2\frac{\beta_{\mathbb{M}}}{\alpha_{\mathbb{M}}}\rho^{d}d^{d}K^{d}(\log|\Xi|)^{d}$ .

In particular, we obtain

[TABLE]

for the number of operations for a matrix vector multiplication with the truncated stiffness matrix, with $C_{comp}=2\frac{\beta_{\mathbb{M}}}{\alpha_{\mathbb{M}}}\rho^{d}d^{d}$ .
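As a rough illustration of this truncation rule, the following Python sketch applies it to a model matrix rather than to an actual kernel stiffness matrix. Everything about the setting is an assumption made for illustration: equispaced points on the unit circle (so $d=1$), entries $e^{-\nu\,\mathrm{dist}(\xi,\eta)/h}$ standing in for the decay from Lemma 3.1, and arbitrarily chosen values of `n`, `nu` and `K`. The helper names `circle_distances` and `truncate` are hypothetical.

```python
import numpy as np

def circle_distances(n):
    """Pairwise geodesic (arc-length) distances of n equispaced points on the unit circle."""
    theta = 2 * np.pi * np.arange(n) / n
    diff = np.abs(theta[:, None] - theta[None, :])
    return np.minimum(diff, 2 * np.pi - diff)

def truncate(A, dist, r):
    """Keep A[i, j] when dist[i, j] <= r and set every other entry to zero."""
    return np.where(dist <= r, A, 0.0)

# Illustrative parameters (assumptions, not values from the paper).
n, nu, K = 400, 1.0, 4.0
h = np.pi / n                       # fill distance of the equispaced points
dist = circle_distances(n)
A = np.exp(-nu * dist / h)          # model matrix with exponential off-diagonal decay
r = K * h * abs(np.log(h))          # truncation radius r = K h |log h|
A_trunc = truncate(A, dist, r)

# Largest number of retained entries in any row.
nnz_per_row = int((A_trunc != 0).sum(axis=1).max())
# Maximum absolute row sum of the discarded part, i.e. the inf -> inf operator norm.
err_inf = float(np.abs(A - A_trunc).sum(axis=1).max())
# Largest single discarded entry.
max_dropped = float(np.abs(A - A_trunc).max())
print(nnz_per_row, err_inf, max_dropped, h ** (nu * K))
```

In this model every discarded entry is at most $h^{\nu K}$, and the number of retained entries per row grows like $K|\log h|$, in line with the count above for $d=1$.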

Lemma 7.3.

Define the global parameter $C_{\text{tr}}:=\frac{\beta_{\mathbb{M}}}{\alpha_{\mathbb{M}}}\sum_{n=1}^{\infty}\left(n+2\right)^{d}e^{-\frac{\nu}{\rho}n}$ . Then the estimate

[TABLE]

holds.

Proof 7.4.

The proof for the first statement is essentially given in [11, Prop. 8.1]. Using Lemma 3.1, we observe, by symmetry, that $\|\boldsymbol{\mathbf{A}}_{\Xi}-\check{\boldsymbol{\mathbf{A}}}_{\Xi;r_{\Xi}}\|_{p\to p}$ is controlled by the maximum of the $\ell_{1}$ and $\ell_{\infty}$ matrix norms, which can be controlled by row and column sums. This leads to off-diagonal sums $\max_{\eta\in\Xi}\sum_{\xi\in\Xi\cap B^{\complement}(\eta,r_{\Xi})}\left|(\boldsymbol{\mathbf{A}}_{\Xi})_{\xi,\eta}\right|$ and $\max_{\xi\in\Xi}\sum_{\eta\in\Xi\cap B^{\complement}(\xi,r_{\Xi})}\left|(\boldsymbol{\mathbf{A}}_{\Xi})_{\xi,\eta}\right|.$ Lemma 7.1 with $r=r_{\Xi}=Kh|\log h|$ and $c=\frac{\nu}{2h}$ yields

[TABLE]

Truncated prolongation and restriction matrices

We introduce truncated prolongation matrices

[TABLE]

where we use the notation $r_{\ell}:=r_{\Xi_{\ell-1}}$ . Likewise, we define $\check{\boldsymbol{\mathbf{r}}}_{\ell;r_{\ell}}:=\left(\check{\boldsymbol{\mathbf{p}}}_{\ell;r_{\ell}}\right)^{T}$ . For the numerical costs, we obtain

[TABLE]

where we use that $|\Xi_{\ell}|\sim h_{\ell}^{-d}\sim\gamma^{-d}h^{-d}_{\ell-1}\sim\gamma^{-d}|\Xi_{\ell-1}|$ due to Eq. 23.

Lemma 7.5.

We have

[TABLE]

Proof 7.6.

We proceed as in the proof of Eq. 39. We can estimate row and column sums of $\boldsymbol{\mathbf{p}}_{\ell}-\check{\boldsymbol{\mathbf{p}}}_{\ell;r_{\ell}}$ by Eq. 13, obtaining $\left\|\boldsymbol{\mathbf{p}}_{\ell}-\check{\boldsymbol{\mathbf{p}}}_{\ell;r_{\ell}}\right\|_{\infty\to\infty}\leq C_{pw}\sum_{\xi\in\Xi_{\ell-1}\cap B^{\complement}(\eta,r_{\ell})}e^{-\nu\frac{\mathrm{dist}(\eta,\xi)}{h_{\ell-1}}}$, so, by Lemma 7.1 with $r=r_{\ell}=Kh_{\ell}|\log h_{\ell}|$ and $c=\frac{\nu}{h_{\ell-1}}$, we have

[TABLE]

Likewise, $\left\|\boldsymbol{\mathbf{p}}_{\ell}-\check{\boldsymbol{\mathbf{p}}}_{\ell;r_{\ell}}\right\|_{1\to 1}\leq C_{pw}\sum_{\eta\in\Xi_{\ell}\cap B^{\complement}(\xi,r_{\ell})}e^{-\nu\frac{\mathrm{dist}(\eta,\xi)}{h_{\ell-1}}}$. Lemma 7.1, again with $r=r_{\ell}=Kh_{\ell}|\log h_{\ell}|$ and $c=\frac{\nu}{h_{\ell-1}}$, gives

[TABLE]

Thus, we get

[TABLE]

Interpolation finishes the proof.

Truncated $\tau$ -cycle

We now consider the multigrid method using truncated stiffness, prolongation and restriction matrices. We denote this by $\operatorname{MGMTRUNC}^{(\tau)}_{\ell}$, and use it to solve Eq. 35 with $\boldsymbol{\mathbf{\check{A}}}_{\ell}=\check{\boldsymbol{\mathbf{A}}}_{\ell;r_{\ell}}$. Lemmas 7.3 and 7.5 show that the conditions of Theorem 6.6 are satisfied when $K$ is chosen sufficiently large.
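To make the structure of a $\tau$-cycle concrete, here is a minimal Python sketch of the standard textbook recursion: pre-smoothing, restriction of the residual, $\tau$ recursive coarse-grid corrections, prolongation, post-smoothing. It is not the kernel method of this paper. As a stand-in it uses a 1D finite-difference Laplacian, linear interpolation, damped Jacobi smoothing and dense matrices, and all function names and parameter values are assumptions chosen for illustration. Restriction is taken as the transpose of prolongation, and coarse matrices are formed by the Galerkin product.

```python
import numpy as np

def laplacian(n):
    """Dense 1D finite-difference Laplacian on n interior points of (0, 1)."""
    h = 1.0 / (n + 1)
    return (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

def prolongation(n_coarse):
    """Linear interpolation from n_coarse to 2 * n_coarse + 1 interior points."""
    P = np.zeros((2 * n_coarse + 1, n_coarse))
    for j in range(n_coarse):
        P[2 * j, j] = 0.5
        P[2 * j + 1, j] = 1.0
        P[2 * j + 2, j] = 0.5
    return P

def jacobi(A, u, b, sweeps, omega=2.0 / 3.0):
    """Damped Jacobi smoothing: u <- u + omega * D^{-1} (b - A u)."""
    d = np.diag(A)
    for _ in range(sweeps):
        u = u + omega * (b - A @ u) / d
    return u

def tau_cycle(level, u, b, A, P, tau=2, nu1=2, nu2=2):
    """One tau-cycle on `level`; A[l] and P[l] hold the level-l matrices."""
    if level == 0:
        return np.linalg.solve(A[0], b)           # exact coarsest-level solve
    u = jacobi(A[level], u, b, nu1)                # nu1 pre-smoothing steps
    r_coarse = P[level].T @ (b - A[level] @ u)     # restrict the residual
    e = np.zeros_like(r_coarse)
    for _ in range(tau):                           # tau recursive coarse corrections
        e = tau_cycle(level - 1, e, r_coarse, A, P, tau, nu1, nu2)
    u = u + P[level] @ e                           # prolongate and correct
    return jacobi(A[level], u, b, nu2)             # nu2 post-smoothing steps

# Nested grids with 3, 7, 15, 31, 63 interior points (illustrative sizes).
sizes = [3, 7, 15, 31, 63]
L = len(sizes) - 1
P = [None] + [prolongation(sizes[l - 1]) for l in range(1, L + 1)]
A = [None] * (L + 1)
A[L] = laplacian(sizes[L])
for l in range(L, 0, -1):
    A[l - 1] = P[l].T @ A[l] @ P[l]                # Galerkin coarse matrices

rng = np.random.default_rng(0)
b = rng.standard_normal(sizes[L])
u_star = np.linalg.solve(A[L], b)
u = np.zeros(sizes[L])
errors = []
for _ in range(8):
    u = tau_cycle(L, u, b, A, P)                   # W-cycle (tau = 2) iteration
    errors.append(float(np.linalg.norm(u - u_star)))
print(errors)
```

In the truncated method discussed here, the dense matrices of this sketch would be replaced by the sparse truncated matrices, so that each smoothing and transfer step costs only as many operations as there are retained entries.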

Theorem 7.7.

If $\tau\gamma^{d}<1$ , we obtain

[TABLE]

Proof 7.8.

Define the floating point operation count for the truncated multi-grid method by ${\tt M}_{\ell}:=\operatorname{FLOPS}(\boldsymbol{\mathbf{x}}\mapsto\operatorname{MGMTRUNC}^{(\tau)}_{\ell}(\boldsymbol{\mathbf{x}},\boldsymbol{\mathbf{b}}_{\ell}))$ .

By the estimates in Eqs. 38 and 40, ${\tt P}_{\ell}:=\operatorname{FLOPS}(\boldsymbol{\mathbf{x}}\mapsto\check{\boldsymbol{\mathbf{p}}}_{\ell;r_{\ell}}\boldsymbol{\mathbf{x}})$, ${\tt R}_{\ell}:=\operatorname{FLOPS}(\boldsymbol{\mathbf{x}}\mapsto\check{\boldsymbol{\mathbf{r}}}_{\ell;r_{\ell}}\boldsymbol{\mathbf{x}})$, and ${\tt A}_{\ell}:=\operatorname{FLOPS}(\boldsymbol{\mathbf{x}}\mapsto\check{\boldsymbol{\mathbf{A}}}_{\ell;r_{\ell}}\boldsymbol{\mathbf{x}})$ are each bounded by $CK^{d}\log(|\Xi_{\ell}|)^{d}|\Xi_{\ell}|$. Because each Jacobi iteration involves multiplication by a matrix with the same number of nonzero entries, we note that

[TABLE]

as well. From this, we have the recursive formula

[TABLE]

Applying Eq. 38 and Eq. 40 gives ${\tt M}_{\ell}\leq CK^{d}(\nu_{1}+\nu_{2}+3)\left(|\log h_{\ell}|^{d}h^{-d}_{\ell}\right)+\tau{\tt M}_{\ell-1}.$

By setting $w_{\ell}=\left(|\log h_{\ell}|/h_{\ell}\right)^{d}$ and $\tilde{C}:=CK^{d}(\nu_{1}+\nu_{2}+3)$ , we have the recurrence

[TABLE]

Note that $w_{\ell}\leq(|\log h_{0}|+\ell|\log\gamma|)^{d}\gamma^{-d\ell}h_{0}^{-d}$, since $h_{\ell}\leq\gamma^{\ell}h_{0}$. By Hölder’s inequality, we have $(|\log h_{0}|+\ell|\log\gamma|)^{d}\leq 2^{d-1}(|\log h_{0}|^{d}+(\ell|\log\gamma|)^{d})$, which provides the estimate $w_{\ell-k}\leq 2^{d-1}h_{0}^{-d}\gamma^{-d\ell}\gamma^{dk}(|\log h_{0}|^{d}+(\ell(1-k/\ell)|\log\gamma|)^{d})$.

Applying this to the above estimate for ${\tt M}_{\ell}$ gives

[TABLE]

The result follows by taking $(\gamma^{\ell}h_{0})^{-d}\leq Ch^{-d}$ and $(|\log h_{0}|+\ell|\log\gamma|)\sim|\log h|$.
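The role of the condition $\tau\gamma^{d}<1$ can be seen numerically. The Python sketch below iterates the cost recurrence ${\tt M}_{\ell}=\tilde{C}w_{\ell}+\tau{\tt M}_{\ell-1}$ with equality, under several assumptions made only for illustration: $h_{\ell}=\gamma^{\ell}h_{0}$ exactly, $\tilde{C}=1$, a coarsest-level cost ${\tt M}_{0}=\tilde{C}w_{0}$, and arbitrarily chosen parameter values. The function name `cycle_cost` is hypothetical.

```python
import math

def cycle_cost(levels, tau, gamma, h0, d, c_tilde=1.0):
    """Iterate M_l = c_tilde * w_l + tau * M_{l-1} with w_l = (|log h_l| / h_l)^d."""
    # Assumption: h_l = gamma^l * h0 exactly.
    w = [(abs(math.log(h0 * gamma**l)) / (h0 * gamma**l)) ** d
         for l in range(levels + 1)]
    M = [c_tilde * w[0]]            # assumed cost on the coarsest level
    for l in range(1, levels + 1):
        M.append(c_tilde * w[l] + tau * M[l - 1])
    return w, M

# Illustrative parameters with tau * gamma**d = 0.5 < 1.
tau, gamma, h0, d = 2, 0.5, 0.25, 2
w, M = cycle_cost(levels=10, tau=tau, gamma=gamma, h0=h0, d=d)
ratios = [m / wl for m, wl in zip(M, w)]
bound = 1.0 / (1.0 - tau * gamma**d)   # geometric-series bound on M_l / (c_tilde * w_l)
print(max(ratios), bound)
```

Because $w_{\ell-k}\leq\gamma^{dk}w_{\ell}$ when $h_{0}<1$, unrolling the recurrence gives a geometric series in $\tau\gamma^{d}$, so the ratio ${\tt M}_{\ell}/w_{\ell}$ stays below $\tilde{C}/(1-\tau\gamma^{d})$ on every level in this model.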

Remark 7.9.

The kernel-based Galerkin problem $\check{\boldsymbol{\mathbf{A}}}_{L;r_{L}}\check{\boldsymbol{\mathbf{u}}}^{\star}_{L}=\boldsymbol{\mathbf{b}}_{L}$ can be solved stably to any precision $\epsilon_{\max}$ by iterating the algorithm $\operatorname{MGMTRUNC}^{(\tau)}_{L}(\check{\boldsymbol{\mathbf{u}}}_{L},\boldsymbol{\mathbf{b}}_{L})$, i.e., the truncated multigrid method with a $\tau\geq 2$-cycle. Select $\gamma<1$ and fix $\nu_{1}$ so that Theorem 6.6 holds. Let $\check{\boldsymbol{\mathbf{u}}}_{L}^{(k+1)}=\operatorname{MGMTRUNC}^{(\tau)}_{L}(\check{\boldsymbol{\mathbf{u}}}_{L}^{(k)},\boldsymbol{\mathbf{b}}_{L})$. If $k$ is the least integer satisfying $\gamma^{k}\|\check{\boldsymbol{\mathbf{u}}}_{L}^{\star}-\check{\boldsymbol{\mathbf{u}}}_{L}^{(0)}\|_{\ell_{2}}<\epsilon_{\max}$, then

[TABLE]

Due to Theorem 7.7, we obtain an overall complexity of

[TABLE]

We note that $\|\check{\boldsymbol{\mathbf{u}}}_{L}^{\star}-\check{\boldsymbol{\mathbf{u}}}_{L}^{(k)}\|_{\boldsymbol{\mathbf{A}}_{L}}\sim\|\sigma_{L}^{*}\left(\check{\boldsymbol{\mathbf{u}}}_{L}^{\star}-\check{\boldsymbol{\mathbf{u}}}_{L}^{(k)}\right)\|_{W_{2}^{1}}\leq C_{\mathfrak{B}}h^{d/2-1}\|\check{\boldsymbol{\mathbf{u}}}_{L}^{\star}-\check{\boldsymbol{\mathbf{u}}}_{L}^{(k)}\|_{\ell_{2}},$ and since $d\geq 2$, achieving $\|\check{\boldsymbol{\mathbf{u}}}_{L}^{\star}-\check{\boldsymbol{\mathbf{u}}}_{L}^{(k)}\|_{\boldsymbol{\mathbf{A}}_{L}}<\epsilon_{\max}$ also requires only a fixed number of iterations. This is the statement of Eq. 42.

Indeed, using $k$ steps of the conjugate gradient method on the original system Eq. 1 would give the error bound $\|\bar{\boldsymbol{\mathbf{u}}}^{(k)}_{L}-\boldsymbol{\mathbf{u}}^{\star}_{L}\|_{\boldsymbol{\mathbf{A}}_{L}}\leq\left(\frac{CN^{2/d}_{L}-1}{CN^{2/d}_{L}+1}\right)^{k}\|\boldsymbol{\mathbf{u}}^{\star}_{L}-\boldsymbol{\mathbf{u}}^{(0)}\|_{\boldsymbol{\mathbf{A}}_{L}}$. Thus, to ensure a tolerance of $\|\bar{\boldsymbol{\mathbf{u}}}^{(k)}_{L}-\boldsymbol{\mathbf{u}}^{\star}_{L}\|_{\boldsymbol{\mathbf{A}}_{L}}\leq\epsilon_{\max}$, one would need

[TABLE]

steps, where we use $\left(\frac{CN^{2/d}_{L}-1}{CN^{2/d}_{L}+1}\right)\sim\left(1-\tilde{C}N^{-2/d}\right)$ .

In contrast to this, the multigrid $W$ -cycle requires only

[TABLE]

iterations to achieve error $\|\check{\boldsymbol{\mathbf{u}}}^{(k)}_{L}-\check{\boldsymbol{\mathbf{u}}}^{\star}_{L}\|_{\boldsymbol{\mathbf{A}}_{L}}\leq\epsilon_{\max}$. In fact, it reaches $\|\check{\boldsymbol{\mathbf{u}}}^{(k)}_{L}-\check{\boldsymbol{\mathbf{u}}}^{\star}_{L}\|_{\ell_{2}}\leq\epsilon_{\max}$, which is a stronger constraint, within $k$ iterations. In particular, the number of iterations is independent of the size $N_{L}$ of the problem.
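The difference between the two iteration counts can be made tangible with a small Python sketch. It evaluates the two error bounds quoted above for sample values. The constant `C = 1.0`, the contraction factor `gamma = 0.5`, the problem size and the tolerance are assumptions for illustration only, and the function names are hypothetical.

```python
import math

def multigrid_iterations(gamma, err0, eps_max):
    """Least k with gamma**k * err0 < eps_max for a contraction factor gamma < 1."""
    k = 0
    while gamma**k * err0 >= eps_max:
        k += 1
    return k

def cg_iterations(n_dof, d, err0, eps_max, C=1.0):
    """Least k for which the bound ((C N^{2/d} - 1)/(C N^{2/d} + 1))**k * err0 <= eps_max."""
    kappa = C * n_dof ** (2.0 / d)       # C is an assumed constant
    rate = (kappa - 1.0) / (kappa + 1.0)
    return math.ceil(math.log(eps_max / err0) / math.log(rate))

err0, eps_max = 1.0, 1e-8
k_mg = multigrid_iterations(gamma=0.5, err0=err0, eps_max=eps_max)
k_cg = cg_iterations(n_dof=10_000, d=2, err0=err0, eps_max=eps_max)
print(k_mg, k_cg)
```

The multigrid count depends only on the contraction factor and the ratio of tolerance to initial error, whereas the count derived from the conjugate gradient bound grows with $N_{L}^{2/d}$.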

Bibliography

[1] P. Collins, Kernel-Based Galerkin Methods on Compact Manifolds Without Boundary, with an Emphasis on SO(3), PhD thesis, University of Hawai’i at Manoa, available at https://scholarspace.manoa.hawaii.edu/items/936c034f-5039-4ef0-8f03-03f21d71184b, 2021.
[2] G. Dziuk and C. M. Elliott, Finite element methods for surface PDEs, Acta Numer., 22 (2013), pp. 289–396, https://doi.org/10.1017/S0962492913000056.
[3] E. Fuselier, T. Hangelbroek, F. J. Narcowich, J. D. Ward, and G. B. Wright, Localized bases for kernel spaces on the unit sphere, SIAM J. Numer. Anal., 51 (2013), pp. 2538–2562, https://doi.org/10.1137/120876940.
[4] T. Hangelbroek, F. J. Narcowich, C. Rieger, and J. D. Ward, Direct and inverse results on bounded domains for meshless methods via localized bases on manifolds, in Contemporary computational mathematics—a celebration of the 80th birthday of Ian Sloan. Vol. 1, 2, J. Dick, F. Kuo, and H. Woźniakowski, eds., Springer, Cham, 2018, pp. 517–543, https://doi.org/10.1007/978-3-319-72456-0_24.
[5] T. Hangelbroek, F. J. Narcowich, X. Sun, and J. D. Ward, Kernel approximation on manifolds II: the $L_{\infty}$ norm of the $L_{2}$ projector, SIAM J. Math. Anal., 43 (2011), pp. 662–684, https://doi.org/10.1137/100795334.
[6] T. Hangelbroek, F. J. Narcowich, and J. D. Ward, Kernel approximation on manifolds I: bounding the Lebesgue constant, SIAM J. Math. Anal., 42 (2010), pp. 1732–1760, https://doi.org/10.1137/090769570.
[7] T. Hangelbroek, F. J. Narcowich, and J. D. Ward, Polyharmonic and related kernels on manifolds: interpolation and approximation, Found. Comput. Math., 12 (2012), pp. 625–670, https://doi.org/10.1007/s10208-011-9113-5.
[8] V. John, Multigrid methods, tech. report, Lecture Notes available at https://www.wias-berlin.de/people/john/LEHRE/MULTIGRID/multigrid.pdf, 2014.

Funding: Thomas Hangelbroek’s research was supported by grant DMS-2010051 from the National
