This paper introduces a multigrid method for efficiently solving kernel-based PDEs on surfaces, achieving scalable linear algebra solutions with rigorous convergence analysis.
Contribution
It develops a geometric multigrid framework for kernel PDEs on manifolds, addressing computational challenges and providing proven convergence rates.
Findings
01
Linear system solving cost scales log-linear with degrees of freedom
02
Multigrid method accelerates kernel PDE solutions on surfaces
03
Rigorous analysis confirms convergence rates
Abstract
Kernel methods for solving partial differential equations on surfaces have the advantage that those methods work intrinsically on the surface and yield high approximation rates if the solution to the partial differential equation is smooth enough. Localized Lagrange bases have proven to alleviate the computational complexity of usual kernel methods to some extent, although the efficient numerical solution of the ill-conditioned linear systems of equations arising from kernel-based Galerkin solutions to PDEs has not been addressed in the literature so far. In this article we apply the framework of the geometric multigrid method with a τ≥2-cycle to scattered, quasi-uniform point clouds on the surface. We show that the resulting linear algebra can be accelerated by using the Lagrange function decay, with convergence rates which are obtained by a rigorous analysis. In particular, we…
≤c2∥u−PVhu∥W21(M)∥u−v∥W21(M) for all v∈Vh.
∥u−PVhu∥W21(M)≤c1c2distW21(M)(u,Vh).
∥u−PVhu∥W21(M)≤c1c2distW21(M)(u,Vh).
∥u−PVhu∥L2(M)≤Chdist∥⋅∥W21(M)(u,Vh).
∥u−PVhu∥L2(M)≤Chdist∥⋅∥W21(M)(u,Vh).
VΞ:=⎩⎨⎧ξ∑a(ξ)ϕ(⋅,ξ)∣(∀p∈Π)ξ∈Ξ∑aξp(ξ)=0⎭⎬⎫+Π
VΞ:=⎩⎨⎧ξ∑a(ξ)ϕ(⋅,ξ)∣(∀p∈Π)ξ∈Ξ∑aξp(ξ)=0⎭⎬⎫+Π
distW2k(M)(u,VΞ)≤Chj−k∥u∥W2j(M).
distW2k(M)(u,VΞ)≤Chj−k∥u∥W2j(M).
∥IΞu−u∥W2k(M)≤Chm−k∥IΞu−u∥W2m(M)
∥IΞu−u∥W2k(M)≤Chm−k∥IΞu−u∥W2m(M)
∥IΞu−u∥W2k(M)≤Chm−k∥u∥W2m(M).
∥IΞu−u∥W2k(M)≤Chm−k∥u∥W2m(M).
K(u~,t)=g∈W2m(M)inf∥u~−g∥W2k(M)+t∥g∥W2m(M)
K(u~,t)=g∈W2m(M)inf∥u~−g∥W2k(M)+t∥g∥W2m(M)
∥u~−g∥W2k(M)
∥u~−g∥W2k(M)
∥u−PΞu∥L2(M)≤Ch∥u∥W21(M)
∥u−PΞu∥L2(M)≤Ch∥u∥W21(M)
∥χξ∥W2m(M∖B(ξ,R))≤Cenq2d−me−νhR.
∥χξ∥W2m(M∖B(ξ,R))≤Cenq2d−me−νhR.
∣χξ(x)∣≤Cpwe−νhdist(x,ξ)
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsAdvanced Numerical Analysis Techniques · Advanced Numerical Methods in Computational Mathematics · 3D Shape Modeling and Analysis
Full text
Kernel Multi-Grid on Manifolds††thanks: \fundingThomas Hangelbroek’s research was supported by grant DMS-2010051 from the National
Science Foundation
Thomas Hangelbroek
Department of Mathematics, University of Hawai‘i – Mānoa,
Honolulu, HI 96822, USA. ().
[email protected]
Christian Rieger
Philipps-Universität Marburg,
Department of Mathematics and Computer Science,
Hans-Meerwein-Straße 6, 35032 Marburg ()
[email protected]
Abstract
Kernel methods for solving partial differential equations on surfaces have the advantage that those methods work intrinsically on the surface and yield high approximation rates if the solution to the partial differential equation is smooth enough.
Naive implementations of kernel based methods suffer, however, from the cubic complexity in the degrees of freedom.
Localized Lagrange bases have proven to overcome this computational complexity to some extent.
In this article we present a rigorous proof for a geometric multigrid method with τ≥2-cycle for elliptic partial differential equations on surfaces which is based on precomputed Lagrange basis functions.
Our new multigrid provably works on quasi-uniform point clouds on the surface and hence does not require a grid-structure.
Moreover, the computational cost scales log-linear in the degrees of freedom.
The numerical solution to partial differential equations on curved geometries is crucial to many real-world applications.
Of the many numerical methods (see for instance [2] for an overview using finite elements,
or [13] [10] for alternative mesh-free methods), we focus on
mesh-free kernel-based Galerkin methods, which have a number of merits, including delivering high approximation orders for smooth data,
providing smooth solutions, and working coordinate-free and without the need for rigid underlying geometric structures like meshes and regular grids.
The major drawback of plain kernel methods is however their prohibitively high numerical costs.
In this paper we address the problem of reducing the computational costs of using kernels without spoiling their analytic advantages.
It has been noticed that in many cases, there is a localized basis for kernel based trial spaces (essentially a Lagrange-type basis or
nodal basis in the finite element context).
We use the localized basis to provide a rigorous analysis of classical multi-grid methods,
precisely the W-cycle for second order linear elliptic differential equations on compact Riemannian manifolds without boundary.
Although multi-grid methods recently have gained attention also in the mesh-less community, see [9] and [16],
the rigorous analysis we provide has been missing in the kernel-based context.
Throughout this article,
we will assume to have access to a sequence of quasi-uniform refined point clouds on the manifold, M and its corresponding Lagrange basis.
Both, the construction of point clouds on manifolds and the computation of the Lagrange basis are independent of the partial differential equation
and can hence be pre-computed and even be stored.
Precisely, we will assume to have a nested sequence of sets Ξ0⊂Ξ1⊂…ΞL⊂M and we denote the finest
kernel-based Galerkin problem by
[TABLE]
where AL (please refer to Eq. 18 for a precise definition) is the matrix which represents the elliptic operator L
on a finite dimensional kernel space expressed in that Lagrange basis, as considered in [3] or [4].
The corresponding approximation takes the form uL∈VΞL (for a definition of the space, please refer to Eq. 11).
We will stick to the convention that boldface letters denote quantities from linear algebra whereas the non-boldface letters denote functions.
We point out that Eq. 1 is usually a densely populated linear system and the
condition number cond2(AL)∼NL2/d is growing with the problem size.
Thus iterative methods cannot be expected to work well on this linear system, as the iterations needed to ensure
a prescribed accuracy will grow with the number of degrees of freedom.
The main novelty of this paper is that we define suitable smoothing, restriction and prolongation operators and
define a mesh-free multi-grid algorithm based on those choices which can be shown to be a contraction with grid-size independent norm, see Theorem 5.9.
Furthermore, by carefully truncating the various rapidly decaying matrices (i.e., stiffness,
restriction and prolongation matrices) which appear in the algorithm, we
have an approximate solution to Eq. 1) which requires only \mathcal{O}\bigl{(}N_{L}\log(N_{L})^{d}\bigr{)}
floating point operations per iteration and where again the multi-grid matrix has a norm bound less than unity (see Theorem 6.6), for
a total operation count of
[TABLE]
to achieve error
∥u−uL⋆∥ℓ2≤ϵ+Clog(NL)dNL−J, where J>0 is a user-determined constant which depends linearly on the sparsity of truncated matrices in the algorithm, see Remark 7.9.
A final application of the triangle inequality also implies an error bound for the function reconstruction of the true solution to the partial differential equation by its numerically obtained approximations.
The remainder of this article is organized as follows.
In Section 2, we introduce the basic notation of second order elliptic equations on manifolds, and their
solution via kernel-based Galerkin approximation. In this section we demonstrate the approximation property
in the kernel context, which, along with the smoothing property, provides the analytic backbone for the success of the multi-grid method.
Section 3 introduces the phenomenon of rapidly decaying Lagrange-type bases, which
holds for certain kernels – using such bases permits stiffness matrices with rapid off-diagonal decay, among other things.
Of special importance is the diagonal behavior of the stiffness matrix given in Lemma 3.5, which is a novel contribution of this paper,
and which is the analytic result necessary to prove the smoothing property.
In Section 4 we discuss the smoothing property of damped Jacobi iterations for kernel-based methods both in the case of symmetric
and non-symmetric differential operators.
In Section 5 we introduce kernel-based restriction and prolongation operators, the classical two-grid method
and then the W-cycle. In this section, we prove Theorem 5.9, a consequence of which is Eq. 42.
Section 6 treats the error resulting from small perturbations of the stiffness matrix, as well as the prolongation and restriction matrices.
The main result in this section, Theorem 6.6, demonstrates how such errors affect the multi-grid approximation error.
Section 7 investigates the computationally efficient truncated multi-grid method as an application of the previous section.
2 Problem Set Up
2.1 Manifold
Consider a compact Riemannian manifold without boundary M. Here we list
some useful tools and their properties which hold in this setting; the relevant results are all taken from [6].
The metric tensor at x∈M is
denoted g(x). For tangent vectors in V,W∈TxM
we have
⟨V,W⟩g,x=∑VjWkgj,k.
For cotangent vectors μ,ν∈Tx∗M we have ⟨μ,ν⟩g,x=∑μjνkgj,k, where
∑gj,kgk,ℓ=δj,ℓ.
The Riemannian metric gives rise to a volume form dμ=det(g(x))dx, and by compactness, there exist
constants 0<αM≤βM so that
[TABLE]
for any x∈M and 0<r<diam(M).
Because M is a metric space, for a finite subset Ξ⊂M, we can define the following useful quantities:
Define the
the separation distance, q of Ξ
and the
fill distance, h, of Ξ in M
as
[TABLE]
The finite sets considered throughout this paper will be quasiuniform, with mesh ratio
ρ:=h/q bounded by a fixed constant. For this reason, quantities which are controlled above or below
by a power of q can be likewise controlled by a power of h – in short, whenever
possible, we express estimates in terms of the fill distance
h, allowing constants to depend on ρ.
For instance, the cardinality #Ξ is bounded above and below by
[TABLE]
Similarly, if f:[0,∞)→R is continuous then for any x∈M,
[TABLE]
2.2 Sobolev spaces
There is a natural inner product on each fiber of the tensor bundle TsrM.
To define Sobolev spaces, we will
be concerned with covariant tensor fields: sections of TkM.
Given vector fields T,S∈TkM,
written in coordinates as
T=∑j∈{1,…,d}kTjdxj1…dxjk
and S=∑j∈{1,…,d}kSjdxj1…dxjk,
at each point x∈M we have
[TABLE]
The covariant derivative ∇
maps tensor fields of rank (r,s) to fields of rank (r,s+1). Its adjoint (with respect to the L2 inner products
on the space of sections of Trs(M)) is denoted ∇∗.
For functions, this is fairly elementary. The covariant derivative of a function f:M→R
equals its exterior derivative; in coordinates, we have
[TABLE]
For a 1-form ω=∑ωjdxj, we have
[TABLE]
The Sobolev space Wpk(Ω) is defined to be the set of functions f:Ω→R which satisfy
[TABLE]
Lemma 3.2 from [6] applies, so there are uniform constants Γ1,Γ2 and r>0
so that the family of exponential maps
{Expx:Br→M∣x∈M} (which are diffeomorphisms taking [math] to x)
provides local metric equivalences: for any open set Ω⊂Br, we have
[TABLE]
This shows that Wpk(M) can be endowed with equivalent norms using a partition of unity (φj)j≤N
subordinate to a cover {Oj}j≤N with associated charts ψj:Oj→Rd to obtain
∥u∥Wpk(M)p∼∑j=1N∥(φju)∘ψj−1∥Wpk(Rd)p. Here constants of equivalence
depend on the partition of unity and charts.
A useful result in this setting, which we will use explicitly in this article, but is also behind a number
of background results in Section 2.5,
is the following zeros estimate [7, Corollary A.13 ],
which holds for Sobolev spaces:
If u∈W2m(M) satisfies u∣X=0, then
[TABLE]
The result can also be obtained on bounded, Lipschitz domain Ω⊂M
that satisfies a uniform cone condition, with the cone having radius RΩ≤rM/3, see [7, Theorem A.11 ]
for a precise statement and definitions of the involved quantities.
2.3 Elliptic operator
We write the operator L in divergence form:
[TABLE]
Here \textsca0 is a smooth function, \textsca1 is a smooth tensor field of type (1,0)
and \textsca2 is a smooth tensor field of type (2,0)
which generates the field
\textsca2♭ type (1,1).
In coordinates, \textsca2 has the form ajk∂xj∂∂xk∂,
and \textsca2♭ is akjdxk∂xj∂ with akj=∑ℓgkℓaℓj.
Furthermore, we require c0>0 to be a constant
so that
[TABLE]
The latter ensures that
[TABLE]
Thus, Eq. 6
guarantees that the bilinear form a(u,v):=∫vLu,
defined initially for smooth functions,
is bounded on W21(M) and
is coercive.
Thus, we have that the energy norm ∥u∥L2:=a(u,u) satisfies the metric equivalence
[TABLE]
2.4 Galerkin methods
We fix an f∈L2(M) and consider u∈W21(M) as solution to
[TABLE]
Regularity estimates ([14, Chapter 5.11, Theorem 11.1]) yield that
u∈W22(M)
and
∥u∥W22(M)≤C∥f∥L2(M).
Consider now a family of finite dimensional subspaces (Vh) with
Vh⊂W21(M)
associated to a parameter111In the
sequel, these will be kernel spaces generated by a subset Ξ⊂M, and h=h(Ξ,M) will be the fill distance.
For now, we consider a more abstract setting, with h>0 only playing a role in establishing the approximation
property distW21(g,Vh)≤Ch∥g∥W22(M) below. h>0.
Define PVh:W21(M)→Vh so that
[TABLE]
The classical Céa Lemma yields
[TABLE]
And thus we get
[TABLE]
This can be improved by a Nitsche-type argument to get the following result, whose proof can be found in many textbooks on numerical methods for partial differential equations.
Lemma 2.1**.**
Suppose the family (Vh) has the property that
for all u~∈W22(M),
the distance
in W21(M)
from Vh satisfies
dist∥⋅∥W21(M)(u~,Vh)≤Ch∥u~∥W22(M).
Then for any
u∈W22(M),
[TABLE]
We point out, that Lemma 2.1 does not depend on the choice of a specific basis in the finite dimensional space Vh.
2.5 Kernel approximation
We consider a continuous
function ϕ:M2→R, the kernel,
which satisfies a number of analytic properties which we explain in this section.
Most important is that k is conditionally positive definite with respect to some
(possibly trivial) finite
dimensional subspace222generally a space spanned by some
eigenfunctions of the Laplace-Beltrami operator Π⊂C∞(M).
Conditional positive definiteness with respect to Π means that for any Ξ⊂M,
the collocation matrix \bigl{(}\phi(\xi,\zeta)\bigr{)}_{\xi,\zeta} is positive definite on the vector
space {a∈RΞ∣(∀p∈Π)∑ξ∈Ξaξp(ξ)=0}.
As a result,
if Ξ separates elements of Π, then the space
[TABLE]
has dimension #Ξ.
In case Π={0}, the kernel is positive definite, and the collocation matrix is strictly positive definite on RΞ,
and VΞ=spanξ∈Ξϕ(⋅,ξ).
For a conditionally positive definite kernel, there is an associated reproducing kernel semi-Hilbert space
N(k)⊂C(M) with the property that Π=null(∥⋅∥N) and
that for any a∈RΞ for which ∑ξ∈Ξaξδξ⊥Π, the identity
∑aξf(ξ)=⟨f,∑aξϕ(⋅,ξ)⟩N holds for all f∈N.
It follows that if Ξ separates elements of Π, interpolation with VΞ is well defined on Ξ
and the projection IΞ:N→VΞ is orthogonal with respect to the semi-norm on N.
Of special interest are (conditionally) positive definite kernels have Sobolev native spaces.
Lemma 2.2**.**
If ϕ is conditionally positive definite with respect to Π
and satisfies the equivalence N/Π≅W2m(M)/Π.
then there is a constant C so that for
any integers k,j with 0≤k≤j≤m
and any u∈W2j(M),
[TABLE]
Proof 2.3**.**
The case k=j is trivial; it follows by considering 0∈VΞ.
For j=m and 0≤k<m,
the zeros estimate [7, Corollary A.13]
ensures
that
[TABLE]
(because the interpolation error IΞf−f vanishes on Ξ).
The hypothesis then gives,
∥IΞu−u∥W2k(M)≤Chm−k∥IΞu−u∥N,
with a suitably enlarged constant.
Because IΞ is an orthogonal projector, ∥IΞu−u∥N≤∥u∥N,
so we have (again enlarging the constant)
that
[TABLE]
In case 0≤k<j<m, we use the fact that
W2j(M)=[W2k(M),W2m(M)]m−kj−k,2.
This is [15, Theorem 5].
For u~∈W2j(M),
this means that the K-functional
[TABLE]
satisfies the condition
∫0∞(t−m−kj−kK(u~,t))2tdt<∞.
Since K(u~,t) is continuous and monotone, we have that
t↦t−m−kj−kK(u~,t) is bounded on (0,∞).
Thus for t=hm−k, there exists g∈W2m(M) so that
[TABLE]
*Let s=IΞg.
The above estimate gives
∥IΞg−g∥W2k(M)≤Chm−k∥g∥W2m(M).
This implies that ∥u~−IΞg∥W2k(M)≤Chj−k∥u~∥W2j(M)
as desired.
*
As a consequence, the kernel Galerkin solution uΞ∈VΞ
to Lu=f with f∈W2(M) satisfies
∥u−uΞ∥W21(M)≤Chj−1∥f∥W2j+1(M).
More importantly, for our purposes, the hypothesis of Lemma 2.1 is satisfied by
the space VΞ.
Indeed, by Eq. 10, we have the following approximation property:
[TABLE]
which, together with a smoothing property, forms the backbone of the convergence theory for
the multigrid method.
3 The Lagrange basis and stiffness matrix
For a kernel ϕ and a set Ξ which separates points of Π, the Lagrange basis
(χξ)ξ∈Ξ for VΞ
satisfies χξ(ζ)=δξ,ζ for each ξ,ζ∈Ξ.
A natural consequence of N≅W2m(M) is that
there exists a constant C so that for any Ξ⊂M,
the bound ∥χξ∥W2m(M)≤Cq2d−m holds.
A stronger result is the following:
Assumption 1**.**
We assume that there is m>d/2+1 so that N≅W2m(M),
and, furthermore,
there exist constants ν>0 and Cen so that for
R>0
[TABLE]
This gives rise to a number of analytic properties, some of which we present here
(there are many more, see [5] and [4] for a detailed discussions).
For the following estimates, the constants of equivalence depend on
Cen, ν, and ρ.
Pointwise decay: there exist constants C and ν>0 so that for any Ξ⊂M, the estimate
[TABLE]
holds.
Here Cpw≤ρm−d/2Cen, where we recall that
the mesh ratio is
ρ=h/q.
Hölder continuity: Of later importance, we mention the following condition, which follows from [7, Corollary A.15].
For any ϵ<m−d/2, the Lagrange function is ϵ Hölder continuous, and
satisfies the bound
[TABLE]
Although we make explicit dependence on it here, for the
remainder of this article, we assume constants to depend on ρ.
Riesz property: there exist constants 0<C1≤C2<∞ so that for any Ξ⊂M, and a∈RΞ, we have
[TABLE]
Bernstein inequalities: There is a constant CB so that for 0≤k≤m,
[TABLE]
3.1 The stiffness matrix
We now discuss the stiffness matrix and some of its properties. Most of these have appeared in
[1], with earlier versions for the sphere appearing in [6].
A consequence of the results of this section is
that the problem of calculating the Galerkin solution to Lu=f from VΞ
involves treating a problem whose condition number grows like O(h−2) – this
is the fundamental issue that the multi-grid method seeks to overcome.
The analysis map for
\bigl{(}\chi_{\xi}\bigr{)}_{\xi\in\Xi_{\ell}} with respect to the
bilinear form a is
[TABLE]
The analysis map is a surjection.
The synthesis map is
[TABLE]
The range of the synthesis map is clearly VΞ;
in other words, it is the natural isomorphism between Euclidean space and the finite dimensional kernel space; indeed,
Eq. 15 shows that it is bounded above and below between L2(M) and ℓ2(Ξ).
By abusing notation slightly,
we write \bigl{(}\sigma^{\ast}_{\Xi}\bigr{)}^{-1}:V_{\Xi}\to\mathbb{R}^{\Xi}.
This permits a direct matrix representation of linear operators on VΞ via conjugation:
S↦S:=(σΞ∗)−1SσΞ∗∈RΞ×Ξ. Furthermore, by the Riesz property Eq. 15, we have
[TABLE]
A simple calculation shows that
a(σΞ∗(w),v))=⟨w,σΞ(v)⟩2,
so σΞ∗ is the a-adjoint of σΞ.
Of course, when a is symmetric, we also have
⟨σℓ(v),w⟩2=a(v,σℓ∗(w)).
The stiffness matrix is defined as
[TABLE]
It represents the operator L
on the finite dimensional space VΞ.
Using the analysis and synthesis maps,
AΞ=σΞ∘σΞ∗:(RΞ,(⋅,⋅)2)→(RΞ,(⋅,⋅)2):c↦(a(χξ,χζ))ξ,ζ∈Ξc,
and the Galerkin projector
PΞ satisfies PΞ=σΞ∗(σΞ∘σΞ∗)−1σΞ:W21(M)→VΞ.
Lemma 3.1**.**
There is a constant Cstiff so that the entries of the stiffness matrix satisfy
[TABLE]
Proof 3.2**.**
We can bound the integral
[TABLE]
Decompose each inner product using the half spaces H+:={x∣dist(x,ξ)<dist(x,η)} and H−=M∖H+, noting
H+⊂M∖B(η,R) and H−⊂M∖B(ξ,R), with R=dist(ξ,η)/2.
By applying Cauchy-Schwartz to
each integral gives, after combining terms,
[TABLE]
for some constant CL depending on the coefficients of L.
We have
∥χξ∥W21(M)≤CBhd/2−1
by the Bernstein inequality Eq. 16 (with a similar estimate for χη).
The zeros estimate for complements of balls,
[7, Corollary A.17],
applied to χξ gives
∥χξ∥W21(M∖B(ξ,R))≤CZhm−1∥χξ∥W2m(M∖B(ξ,R))
(with a similar estimate for χη). Thus we have
[TABLE]
*The lemma follows with Cstiff=2CLCBCZCenρm−d/2.
*
By considering row and column sums, that
[TABLE]
with CA=Cstiff(1+∑n=1∞(n+2)de−2ρνn).
Lemma 3.3**.**
Under Assumptions 1, for Ξ⊂M, the stiffness matrix satisfies
[TABLE]
*with a constant Cinv which is independent of Ξ.
*
Proof 3.4**.**
Coercivity ensures
∣a(∑ξ∈Ξvξχξ,∑ξ∈Ξvξχξ)∣≥c0∑ξ∈ΞvξχξW21(M)2,
and the metric equivalence ∥v∥W21(M)2=∥(1−Δ)1/2v∥L2(M)
gives
[TABLE]
is the stiffness matrix for the self-adjoint operator
1−Δ.
Because σ(1−Δ)⊂[1,∞),
we conclude that
v⋅Lv≥∑ξ∈ΞvξχξL2(M)2≥C1hd∥v∥ℓ2(Ξ)2
by the Riesz property Eq. 15.
Hence, overall we get
[TABLE]
*with c=c0C1,
so ∥AΞ−1∥ℓ2→ℓ2≤c0C11h−d.
*
Consequently, the ℓ2 condition number of AΞ is bounded by a multiple of h−2.
3.2 The diagonal of the stiffness matrix
As a counterpart to the off-diagonal decay given in Lemma 3.1, we can give the following
lower bounds on the diagonal entries.
Lemma 3.5**.**
For an elliptic operator L and a kernel satisfying Assumption 1, for a mesh ratio ρ,
there is Cdiag>0 so that for any Ξ⊂M with h/q<ρ, and any ξ∈Ξ,
we have
[TABLE]
Proof 3.6**.**
By coercivity of a, it suffices to prove
that
∥∇χξ∥L2(B(ξ,h))2≳hd−2.
We begin by establishing a Poincaré-type inequality which is valid for smooth Lagrange functions.
To this end, consider f:B(0,rM)→R obtained by the change of variable f(x)=χξ(expξ(x)).
Note that for B⊂B(ξ,rM), we have
for any 0≤k≤m, that
[TABLE]
by the zeros estimate, with C1:=Γ11ρm−d/2CenCZ, where the constants stem from Eq. 4.
Now let r:=dist(ξ,Ξ∖{ξ}), and define F:B(0,1)→R by
F:=f(r⋅). Then, by a change of variable,
[TABLE]
with C2:=ρdC1, since h/ρ≤q≤r≤h.
Because of the embedding W2m(B(0,1))⊂C(B(0,1))
(which holds since m>d/2),
the set K:={G∈W2m(B(0,1))∣∥G∥W2m(B(0,1))≤C2,G(0)=1,G(e1)=0}
is well defined, closed and convex, hence weakly compact, by Banach-Alaoglu.
The natural embedding e:W2m(B(0,1))→W21(B(0,1)),
is compact. We wish to show that eK is a compact set.
Because e is a continuous linear map, it is continuous between the weak topologies
of W2m(B(0,1)) and W21(B(0,1)). Thus e(K) is weakly compact in W21(B(0,1)),
and thus norm closed. Finally, because e(K) is complete and totally bounded,
it is a compact subset in the norm topology of W21(B(0,1)).
Consider the (possibly zero) constant
c defined by
[TABLE]
The map I:W21(B(0,1))→R:G↦∥G∥L2∥∇G∥L2
is continuous on the complement of 0∈W21(B(0,1))
(as quotient of two continuous functions that do not vanish).
In particular, it is continuous and non-vanishing on e(K),
so c=minG∈e(K)I(G)>0.
Indeed, I(G)>0 for all G∈e(K), since G(0)=1, G(e1)=0 and G∈C1(B).
Note that in the above minimization problem,
the condition G(e1)=0 could be replaced by any other point on the unit circle
without changing the value of c.
By rotation invariance of the W2m(Rd) norm, it follows that ∥∇F∥L2≥c∥F∥L2. Finally,
employing the change of variables rd−2∥∇F∥L2(B(0,1))2=∥∇f∥L2(B(0,r))2
and ∥F∥L2(B(0,1))2=rd∥f∥L2(B(0,r))2, we have
[TABLE]
The last line follows because χξ is Hölder continuous, so stays close to 1 near ξ.
Specifically, by Eq. 14, for κ:=(2CH)−1/ϵ
we have χξ(x)>21 for all x∈B(ξ,κh).
Thus
[TABLE]
*The lemma follows with constant Cdiag=41c2Γ12ρ−2αMκd.
*
This brings us to the lower bound for diagonal entries of the stiffness matrix. Define the diagonal of AΞ as
BΞ:=diag(AΞ),
and note that by Lemma 3.1 and Lemma 3.5,
[TABLE]
is bounded above by a constant which depends only on the mesh ratio ρ
(and not on Ξ).
This permits us to find suitable damping constants 0<θ<1 so that
BΞ dominates θAΞ.
This drives
the success of the damped Jacobi method
considered in the next section.
Lemma 3.7**.**
*For an elliptic operator L, a kernel ϕ satisfying Assumption 1, and mesh ratio ρ, there is θ∈(0,1) so that for any point set
Ξ⊂M,
θ⟨AΞv,v⟩≤⟨BΞv,v⟩ for all v∈RΞ.
*
Proof 3.8**.**
*By Eq. 19, ⟨AΞv,v⟩≤CAhd−2∥v∥2, while
Cdiaghd−2∥v∥2≤⟨BΞv,x⟩
follows from Lemma 3.5. Thus the lemma holds for any θ in the interval (0,Cdiag/CA].
*
4 The smoothing property
In this section we define and study the smoothing operator used in the multigrid method.
We focus on the damped Jacobi method for the linear system AΞuΞ⋆=b,
where AΞ∈R∣Ξ∣×∣Ξ∣ and b∈R∣Ξ∣.
For a fixed damping parameter 0<θ<1,
u(j) is approximately computed via the iteration:
[TABLE]
with starting value u(0)∈R∣Ξ∣.
Define the affine map governing a single iteration as
[TABLE]
For k≥1, the associated damped Jacobi iteration is
JΞk+1(u,b):=JΞ(JΞk(u,b),b).
For a starting value u0, we have
u(k)=JΞk(u(0),b).
Since u(k)−u⋆=(id−θBΞ−1AΞ)(u(k−1)−u⋆), we have the identity
u(k)−u⋆=(id−θBΞ−1AΞ)k(u(0)−u⋆).
This shows that the iteration converges if and only if
[TABLE]
called the smoothing matrix,
is a contraction.
This gives a recursive form for the error:
[TABLE]
In the context of kernel approximation,
the corresponding operator,
the smoothing operator,
SΞ:VΞ→VΞ
is defined
as
SΞσΞ∗:=σΞ∗SΞ.
The success of the multi-grid method relies on a smoothing property, which for our
purposes states that iterating SΞ is eventually contracting:
∥SΞν∥L2(M)→L≤Ch−1o(ν) as ν→∞. This
smoothing property is demonstrated in Section 4.2.
4.1 L2 stability of the smoothing operator
At this point, we are in a position to show that that iterating this operator is stable on L2(M).
To help analyze the matrix SΞ=id−θBΞ−1AΞ,
we introduce the inner product
[TABLE]
Since BΞ is diagonal, and its diagonal entries are
⟨χξ,χξ⟩L∼hd−2, we have
the norm equivalence ∥M∥BΞ→BΞ∼∥M∥2→2. Specifically, we have
[TABLE]
with constant of equivalence
CB=κ(BΞ1/2)=mina(χξ,χξ)maxa(χξ,χξ)≤Cstiff/Cdiag.
Lemma 4.1**.**
For the damping parameter θ in Lemma 3.7, there is C>0 so that
for all n, u∈RΞ and u=σΞ∗u∈VΞ,
[TABLE]
We note that this holds even when a is non-symmetric.
Proof 4.2**.**
The matrix
BΞ1/2(id−θBΞ−1AΞ)BΞ−1/2
is symmetric, and has the same spectrum
as id−θBΞ−1AΞ.
Furthermore, because
0≤⟨BΞ−θAΞx,x⟩≤⟨BΞx,x⟩,
the spectral radius of SΞ is no greater than 1.
Thus
[TABLE]
*It follows that ∥SΞn∥BΞ→BΞ≤1 as well, and the matrix norm equivalence
Eq. 22 guarantees that ∥SΞn∥2→2≤κ(BΞ1/2).
The second inequality follows from the Riesz property Eq. 15.
*
4.2 Smoothing properties
For v=σΞ∗v∈VΞ, we have ∥v∥L=a(v,v)=⟨AΞv,v⟩.
By applying Cauchy-Schwarz, the Riesz property and Lemma 4.1, the chain of inequalities
[TABLE]
holds. In the case that a is symmetric, we have the following smoothness property.
Lemma 4.3**.**
For a given ρ>0 there is a constant C so that for any θ chosen as in Lemma 3.7, and Ξ⊂M with mesh ratio
ρ(Ξ)≤ρ, the damped Jacobi iteration has smoothing operator SΞ∈L(VΞ) which satisfies
[TABLE]
Proof 4.4**.**
*This is a result of [12, Theorem 7.9], which shows that
∥AΞSΞnv∥ℓ2≤θ(ν+1)Chd−2∥v∥ℓ2.
It follows that ∥AΞSΞnv∥ℓ2≤θ(n+1)Chd/2−2∥v∥L2, and the result holds by the above discussion.
*
When a is not symmetric, we have the smoothness property.
Lemma 4.5**.**
Let SΞ:VΞ→VΞ:v=∑vξχξ↦∑(SΞv)ξχξ.
For θ as in Lemma 3.7 we have
[TABLE]
Proof 4.6**.**
*This follows from [12, Theorem 7.17], and by techniques of the proof of Lemma 4.3.
*
5 The direct kernel multigrid method
We are now in a position to consider the multigrid method applied to the kernel based Galerkin method
we have described in the previous sections.
5.1 Setup: Grid transfer
We consider a nested sequence of point sets
[TABLE]
and associated kernel spaces VΞℓ as described in Section 2.5.
We denote the Lagrange basis for each such space
(χξ(ℓ))ξ∈Ξℓ,
and with it the accompanying analysis map σℓ:=σΞℓ,
synthesis map σℓ∗:=σΞℓ∗ and stiffness matrix Aℓ:=AΞℓ.
Moreover, we assume that there are constants 0<γ1≤γ2<1 and ρ≥1 such that
[TABLE]
Note at this point that ρ is a universal constant.
Hence ρ does not depend on ℓ and thus the constants in deriving the smoothing property do not depend on ℓ.
Thus, we obtain nℓ∼hΞℓ−d, for constants see Eq. 2.
We will assume that L is the largest index that we will consider.
In this section, we discuss grid transfer: specifically, the operators and matrices which provide communication
between finite dimensional
kernel spaces. These include natural prolongation and restriction maps and their corresponding matrices.
We show how these can be used to relate Galerkin projectors PΞℓ and stiffness matrices Aℓ.
Prolongation and restriction
Denote the Lagrange basis of VΞℓ by (χξ(ℓ)),
and note that by containment VΞℓ−1⊂VΞℓ, it follows that
χξ(ℓ−1)=∑η∈Ξℓβξ,ηχη(ℓ)
holds for some matrix of coefficients βξ,ℓ.
Furthermore, from the Lagrange property, the identity
χξ(ℓ−1)(η)=∑ζ∈Ξℓχξ(ℓ−1)(ζ)χζ(ℓ)(η).
holds for any η∈Ξℓ−1.
By uniqueness, we deduce that βξ,η=χξ(ℓ−1)(η),
and we have
[TABLE]
This yields that the natural injection Iℓ−1(ℓ):VΞℓ−1→VΞℓ,
called the prolongation map, which is described by the rectangular matrix
pℓ:=(χξ(ℓ−1)(η))ξ∈Ξℓ−1,η∈Ξℓ=(σℓ∗)−1σℓ−1∗∈Rnℓ×nℓ−1.
It is worth noting that Iℓ−1ℓσℓ−1∗=σℓ−1∗, so we have
the identity
[TABLE]
The corresponding *restriction map *
Iℓ(ℓ−1):VΞℓ→VΞℓ−1
is described by the transposed matrix rℓ=(pℓ)T.
In other words, it is defined as
Iℓ(ℓ−1)σℓ∗=σℓ−1∗(pℓ)T.
Note that we can use rℓ=(pℓ)T to relate analysis maps at different levels, since
we can take the a-adjoint of both sides of the equation
Eq. 24
to obtain the following useful identity:
[TABLE]
Moreover, the prolongation is both bounded from above and below. This is a kernel based analogue for [12, Eq. (64)].
Lemma 5.1**.**
Using the notation from above, there
is a constant Cpro≥1
depending on γ, ρ, M and the constants in Assumption 2 so that
[TABLE]
*holds for all ℓ≥1.
*
Proof 5.2**.**
We begin by estimating the ℓ1(Ξℓ−1)→ℓ1(Ξℓ) and ℓ∞(Ξℓ−1)→ℓ∞(Ξℓ) norms of
pℓ by taking column and row sums, respectively. These estimates can be made almost simultaneously, because
the (ξ,η) entry of pℓ satisfies the bound
∣χξ(ℓ−1)(η)∣≤Cpwexp(−νhℓ−1dist(ξ,η))
by Eq. 13.
Let A:=∑n=1∞(n+2)dexp(−ρνn) and B:=∑n=1∞(n+2)dexp(−ρνγn),
and note that both numbers depend on ρ,γ and the exponential decay rate ν from Eq. 13.
Applying Eq. 3, gives
[TABLE]
Finally, interpolation gives an upper bound,
with Cpro:=Cpw(1+αMβMB)(1+αMβMA).
*By using the definition Υℓ:=Ξℓ∖Ξℓ−1
we write pa=(χξ(η))ξ,η∈Ξℓ−1 and
pb=(χξ(ζ))ξ∈Ξℓ−1,ζ∈Υℓ.
Thus, we have
∥pℓc∥22=∥c∥22+∥pbc~∥22≥∥c∥22,
which gives the lower bound.
*
5.2 Multigrid iteration – two level case
We now describe the multigrid algorithm, which is a composition of smoothing operators, restriction, coarse grid correction,
prolongation, and then smoothing.
We begin by considering the solution of Aℓuℓ⋆=bℓ,
where Aℓ is the
stiffness matrix associated to VΞℓ,
and where
bℓ=σℓuℓ⋆,
is the data obtained from the Galerkin solution
uℓ⋆=σℓ∗uℓ⋆∈VΞℓ.
Naturally, uℓ⋆ is unknown (its coefficients are the solution of the above problem),
but
we can compute the data
σℓuℓ⋆ via
[TABLE]
In other words, it is obtained from the right hand side f.
The output uℓnew=TGMℓ(uℓold,bℓ)
of the two-level multigrid algorithm
with initial input uℓold is given by the rather complicated formula
[TABLE]
This can be simplified if we make an affine correction: by considering the error
uℓnew−uℓ⋆.
We use Eq. 21 to express the affine
smoothing operators in terms of a matrix product.
We can write
bℓ−AℓJℓν1(uℓold,bℓ)
as the expression
Aℓ(uℓ⋆−Jℓν1(uℓold,bℓ)), and
then, by employing Eq. 21 again, we have
−Aℓ(Sℓν1(uℓold−uℓ⋆)).
Now we can rewrite uℓnew as
[TABLE]
As in [12, Eq. 48], by using the two grid iteration matrix
[TABLE]
the error can be expressed as
uℓnew−uℓ⋆=TGMℓ(uℓold)−uℓ⋆=CTGℓ(uℓold−uℓ⋆).
The corresponding operator on VΞℓ
is obtained by conjugating with σℓ∗. This gives the
error operator for the two level method CTGℓσℓ∗:=σℓ∗CTGℓ.
It is worth noting that by Eq. 24 we have the equality
σℓ∗pℓ(Aℓ−1)−1rℓAℓ=σℓ−1∗(Aℓ−1)−1rℓAℓ.
Using the identity Aℓ=σℓσℓ∗
followed by
25
and the identity
PΞℓ−1=σℓ−1∗(Aℓ−1)−1σℓ−1
gives
[TABLE]
It follows that
CTGℓ=Sℓν2(idVΞℓ−1−PΞℓ−1)Sℓν1.
We are now in a position to show that the two level method is a contraction for sufficiently large values of ν1.
Proposition 5.3**.**
There is a constant C so that for all ℓ,
CTGℓsatisfies the bound
[TABLE]
Proof 5.4**.**
We have that
∥CTGℓ∥=Sℓν2(idVΞℓ−1−PΞℓ−1)Sℓν1
holds,
so Lemma 4.1 ensures
[TABLE]
By Lemma 2.1,
idVΞℓ−1−PΞℓ−1W21(M)→L2(M)≤Chℓ−1
holds, so
it follows that
[TABLE]
*By coercivity, this gives ∥CTGℓ∥L2(M)→L2(M)≤Chℓ−1∥Sℓν1v∥L2(M)→L.
Finally, the result follows by applying the smoothing property: Lemma 4.3 in the symmetric case and Lemma 4.5 in the non-symmetric case.
*
Corollary 5.5**.**
Let θ as in Lemma 3.7 and let uℓold∈Rnℓ be an initial guess,
uℓold=σℓ∗(uℓold),
uℓ=TGMℓ(uℓold),
and uℓ=σℓ∗uℓ.
If a is symmetric, there is a constant C independent of ℓ and θ so
that
[TABLE]
If a is not symmetric,
there is a constant C independent of ℓ and θ so
that
In the two-grid method, the computational bottleneck remains the solution on the coarse grid.
Thus, there have been many approaches to recursively apply the multi-grid philosophy in order to use a direct solver only on the coarsest grid.
A flexible algorithm is the so-called τ-cycle.
Here τ=1 stands for the V-cycle in multi-grid methods and τ=2 gives the W-cycle.
Our results hold for τ≥2.
Before proving our main theorem, we need a statement from elementary real analysis.
Lemma 5.7**.**
For any real numbers α,β,γ,τ which satisfy
0<γ<1,
τ≥2,
β>1/τ and α<min{ττ−1(βτ)−τ−11,ττ−1γ},
if the
sequence (xn)n∈N0 satisfies the conditions
[TABLE]
*then
xn≤γ for all
and all n≥0.
*
Proof 5.8**.**
*This follows by elementary calculations, as in [8, Lemma 6.15].
*
Using [12, Theorem 7.1], we obtain for the iteration matrix of the Algorithm 2 the recursive (in the level) form
[TABLE]
Again, we define the corresponding operator via Cℓ:=σℓ∗Cℓ(σℓ∗)−1.
Theorem 5.9**.**
For every γ∈(0,1), there is a ν⋆:=argminν∈N{ν∈N:CProp.\refprop:boundtwogridopg(ν)≤min{ττ−1(βThm.\refthm:mainτ)−τ−11,ττ−1γ}}. For all ν1≥ν⋆ we have
[TABLE]
Proof 5.10**.**
Here, we follow basically [12, Poof of Theorem 7.20]. Let v∈Rnℓ arbitrary.
For v=σℓ∗v, we obtain for ℓ∈N,
The last factor can be bounded by writing PΞℓ−1=id−(id−PΞℓ−1) followed by the triangle inequality.
Lemma 4.1 bounds ∥Sℓν1∥L2(M)→L2(M),
while Proposition 5.3 (with ν2=0) bounds
∥(id−PΞℓ−1)Sℓν1∥L2(M)→L2(M).
Thus, we end up with a bound
[TABLE]
which has the form required by Lemma 5.7,
with xℓ=∥Cℓ∥L2→L2,
βThm.\refthm:main:=C
and α=∥CTGℓ∥L2(M)→L2(M).
The condition ν1≥ν⋆ ensures the bound α≤min{ττ−1(βThm.\refthm:mainτ)−τ−11,ττ−1γ}}.
Thus
At the finest level, the kernel-based Galerkin problem ALx=bL, can be solved stably to any precision ϵmax,
by iterating the contraction matrix CLτ .
Select γ<1 and fix ν1 so that Theorem 5.9 holds.
Letting u(k+1)=MGML(τ)(u(k),bℓ) gives
∥u∗−u(k)∥ℓ2≤γk∥u∗−u(0)∥ℓ2.
If k is the least integer satisfying γk∥u∗−u(0)∥<ϵmax,
then k∼logγ1log(∥u∗−u(0)∥ϵmax).
*We note that
∥u∗−u(k)∥AL∼∥σL∗u∗−σL∗u(k)∥W21≤CBhd/2−1∥u∗−u(k)∥ℓ2,
and since d≥2, achieving ∥u∗−u(k)∥AL<ϵmax also requires only a fixed number of iterations. This shows Eq. 42.
*
6 The perturbed multigrid method
In this section, we consider a modified problem
[TABLE]
where AˇL is close to AL.
The perturbed multigrid method will produce an approximate solution uˇL(k) to uL⋆ which satisfies
∥uˇL(k)−uL⋆∥≤∥uˇL(k)−uˇL⋆∥+∥AˇL−1−AL−1∥∥bL∥.
Thus, for the true solution u to Eq. 8
and the Galerkin solution uL⋆=PΞLu=σL∗uL⋆, we have
[TABLE]
which can be made as close to ∥(1−PΞL)u∥L2 as desired by controlling the perturbation ∥AˇL−AL∥2→2
and the error from the multi-grid approximation ∥uˇL(k)−uˇL⋆∥.
Such systems may occur for a number of reasons: using localized Lagrange basis functions (as in [11]),
truncating a series expansion of the kernel (as in [1]),
or by using quadrature to approximate the stiffness matrix (both [11] and [1]).
In the next section, we will apply this by truncating the original stiffness matrix to employ only banded matrices and thereby
enjoy a computational speed up.
Perturbed multigrid method
The perturbed multigrid method replaces matrices Aℓ, pℓ and rℓ appearing in Algorithms 1 and 2 with
matrices Aˇℓ, pˇℓ and rˇℓ
We assume that for each ℓ there exists 0<ϵℓ so that
[TABLE]
In this set up ϵℓ may change per level.333Which could be the case, e.g., if Aˇℓ involved a Ξℓ dependent
quadrature scheme, or was obtained by bandlimiting (as we will do in the next section)
We assume ϵℓ is sufficiently small that
[TABLE]
It then follows from standard arguments that
[TABLE]
Because of the entry-wise error ∣(Aˇℓ)ξ,ξ−(Aℓ)ξ,ξ∣≤ϵℓ,
we also have that the diagonal matrix Bˇℓ=diag(Aˇℓ)
satisfies
[TABLE]
Therefore, there is θ so that for all ℓ and all x∈Rnℓ,
θ⟨Aˇℓx,x⟩≤⟨Bˇℓx,x⟩ holds.
This permits us to consider the Jacobi iteration applied to the perturbed linear system
Aˇℓxℓ=b,
which yields Jˇℓ:Rnℓ×Rnℓ→Rnℓ defined by
[TABLE]
Since \boldsymbol{\mathbf{S}}_{\ell}-\boldsymbol{\mathbf{\check{S}}}_{\ell}=\theta\Bigl{(}\boldsymbol{\mathbf{\check{B}}}_{\ell}^{-1}(\boldsymbol{\mathbf{\check{A}}}_{\ell}-\boldsymbol{\mathbf{A}}_{\ell})+(\boldsymbol{\mathbf{\check{B}}}_{\ell}^{-1}-\boldsymbol{\mathbf{B}}_{\ell}^{-1})\boldsymbol{\mathbf{A}}_{\ell}\Bigr{)}, we can estimate the error between smoothing matrices as
[TABLE]
Because θ⟨Aˇℓx,x⟩≤⟨Bˇℓx,x⟩ it follows
that for all n,
[TABLE]
This also yields the following Lemma.
Lemma 6.1**.**
For M∈N, we get the bound
[TABLE]
Proof 6.2**.**
By telescoping, we have SℓM−SˇℓM=∑j=0M−1SℓM−1−j(Sℓ−Sˇℓ)Sˇℓj.
The inequality
[TABLE]
*follows from norm properties, and the result follows by applying Eq. 33 and Lemma 4.1.
*
This lemma can be combined with the estimate Eq. 32
to obtain
[TABLE]
Perturbed two grid method
Now, we consider the perturbed version of the two step algorithm.
We aim to apply the two-grid method with only truncated matrices to the problem
[TABLE]
Applying the two-grid method
to
Eq. 35, we obtain
uˇℓ⋆−uˇℓnew=CˇTGℓ(uˇℓ⋆−uˇℓold),
where the two grid iteration matrix is
[TABLE]
Lemma 6.3**.**
If ϵℓ≤hℓd+2 holds for all ℓ≤L,
then
[TABLE]
Remark 6.4**.**
A basic idea, used throughout this section, is the following result: If max(∥Mj∥,∥Mˇj∥)≤Cj, then
*Because ∥id−pˇℓAˇℓ−1−1rˇℓAˇℓ∥≤Chℓ−2 and ∥id−pˇℓAˇℓ−1−1rˇℓAˇℓ∥≤Chℓ−2,
the lemma follows by using Remark 6.4
with Eqs. 32 and 34, Lemma 6.1, and the above estimate of ∥E1∥.
*
Perturbed τ-cycle
As in the two-step case we consider the multigrid method also for the truncated system in
Eq. 35, i.e., Aˇℓuˇℓ⋆=bℓ=σℓuℓ⋆.
The multi-grid iteration matrix is
[TABLE]
From this we define the operator Cˇℓ:=σℓ∗Cˇℓ(σℓ∗)−1.
Theorem 6.6**.**
For any 0<γ<1,
there exist constants C1, C2 and C4 and ν⋆∈N such that CProp.\refprop:boundtwogridopg(ν⋆)≤C4min{ττ−1(βThm.\refthm:mainτ)−τ−11,ττ−1γ}.
For ν1≥ν⋆ choose ϵℓ small enough such that \epsilon_{\ell}h_{\ell}^{-(d+2)}(h_{\ell}^{-4}+\nu_{1}+\nu_{2})<C^{-1}_{\text{Lem.\ref{lem:2step_pert}}}\min(C_{1},\gamma C_{2}) for 0≤ℓ≤L for all ℓ and if h_{0}\leq\left(C^{-1}_{\text{Lem.\ref{lem:2step_pert}}}\min(C_{1},\gamma C_{2})\right)^{-1/4}, then
[TABLE]
Proof 6.7**.**
As in the proof of Theorem 5.9, we make the estimate
[TABLE]
Then Eq. 33 ensures that
∥σℓ∗pˇℓ(σℓ−1∗)−1Cˇℓ−1τσℓ−1∗Aˇℓ−1−1rˇℓAˇℓ(σℓ∗)−1Sˇℓν1∥ controls the second expression.
Considering the difference
E:=σℓ∗pˇℓCˇℓ−1τAˇℓ−1−1rˇℓAˇℓ(σℓ∗)−1Sˇℓν1−σℓ∗pℓCˇℓ−1τAℓ−1−1rℓAℓ(σℓ∗)−1Sℓν1, Remark 6.4
gives
[TABLE]
Using the Riesz property, this gives
[TABLE]
As in the proof of Theorem 5.9, the last normed expression can be bounded as
[TABLE]
Because (hℓ−2+ν1)hℓ−dϵℓ is bounded by assumption,
it follows that
[TABLE]
As by assumption ϵℓ≤hℓd+2 (no constants due to h≤h0) holds for all ℓ≤L, we
obtain by Lemma 6.3
[TABLE]
We use Theorem 5.9 and choose a natural number ν1⋆ large enough such that the inequality CProp.\refprop:boundtwogridopg(ν1⋆)≤C4min{ττ−1(βThm.\refthm:mainτ)−τ−11,ττ−1γ} is satisfied.
Thus, we obtain
[TABLE]
We have
[TABLE]
Thus, we define
[TABLE]
*Hence Lemma 5.7 applies and the result follows.
*
7 Truncated multigrid method
In this section we consider truncating the stiffness, prolongation and restriction matrices in order to improve the computational
complexity of the method. Each such matrix has stationary, exponential off-diagonal decay, so by retaining the (ξ,η) entry
when dist(ξ,η)≤Khℓ∣loghℓ∣, and setting the rest to zero, guarantees a small perturbation error (on the order
of O(hℓJ), where J∝K). This is made precise in Lemmas 7.3 and 7.5 below, with
the aid of the following lemma.
Lemma 7.1**.**
Suppose Ξ⊂M, c>0, and r≥2q(Ξ). Then for any η∈M, we have
[TABLE]
Proof 7.2**.**
The underlying set can be decomposed as {ξ∈Ξ∣dist(ξ,η)≥r}=⋃j=0∞Aj,
where Aj={ξ∈Ξ∣r+jq≤dist(ξ,η)<r+(j+1)q} has cardinality
\#\mathcal{A}_{j}\leq\frac{\alpha_{\mathbb{M}}}{\beta_{\mathbb{M}}}\bigl{(}\frac{r}{q}+(j+2)\bigr{)}^{d}.
It follows that
[TABLE]
*and the lemma follows from the fact that qr+j+2≤qr(j+2) for all j≥0.
*
Truncated stiffness matrix
The exponential decay in Lemma 3.1 motivates the truncation of the stiffness matrix, see e.g. [11, Eq. (8.1)]
We define for positive K, the truncation parameter rΞ:=Kh∣log(h)∣
[TABLE]
We note that AˇΞ;rΞ is symmetric if AΞ is symmetric.
By construction and quasi-uniformity, we obtain
#{ξ∈B(η,rΞ)∩Ξ}≤ρdhΞ,M−dαMβM(rΞ+q)d≤2αMβMρd∣Klogh∣d.
By h−d≤C∣Ξ∣, this yields
#{ξ∈Ξ∣(AˇΞ;rΞ)ξ,η=0}≤2αMβMρdddKd(log∣Ξ∣)d.
In particular, we obtain
[TABLE]
for the number of operations for a matrix vector multiplication with the truncated stiffness matrix, with
Ccomp=2αMβMρddd.
Lemma 7.3**.**
Define the global parameter Ctr:=αMβM∑n=1∞(n+2)de−ρνn.
Then the estimate
[TABLE]
*holds.
*
Proof 7.4**.**
The proof for the first statement is essentially given in [11, Prop 8.1].
Using Lemma 3.1, we observe, by symmetry, that
∥AΞ−AˇΞ;rΞ∥p→p is controlled by the maximum of the ℓ1 and ℓ∞ matrix norms,
which can be controlled by row and column sums.
This leads to off-diagonal sums
maxη∈Ξ∑ξ∈Ξ∩B∁(η,rΞ)∣(AΞ)ξ,η∣
and maxξ∈Ξ∑η∈Ξ∩B∁(η,rΞ)∣(AΞ)ξ,η∣.Lemma 7.1 with r=rΞ=Kh∣logh∣ and c=2hν yields
[TABLE]
Truncated prolongation and restriction matrices
We introduce truncated prolongation matrices
[TABLE]
where we use the notation rℓ:=rΞℓ−1.
Likewise, we define rˇℓ;rℓ:=(pˇℓ;rℓ)T.
For the numerical costs, we obtain
[TABLE]
where we use that
∣Ξℓ∣∼hℓ−d=γdhℓ−1d∼∣Ξℓ−1∣∼γd∣Ξℓ−1∣ due to Eq. 23.
Lemma 7.5**.**
We have
[TABLE]
Proof 7.6**.**
We proceed as in the proof of Eq. 39.
We can estimate row and column sums of pℓ−pˇℓ;rℓ by Eq. 13, obtaining
pℓ−pˇℓ;rℓ∞→∞≤Cpw∑ξ∈Ξℓ−1∩B∁(η,rℓ)e−νhℓ−1dist(η,ξ),
so, by Lemma 7.1
with r=rℓ=Khℓ∣loghℓ∣ and c=hℓ−1ν, we have
, we have
[TABLE]
Likewise,
pℓ−pˇℓ;rℓ1→1≤Cpw∑η∈Ξℓ∩B∁(ξ,rℓ)e−νhℓ−1dist(η,ξ).
Lemma 7.1 yields this time
with r=rℓ=Khℓ∣loghℓ∣ and c=hℓ−1ν, gives
[TABLE]
Thus, we get
[TABLE]
*Interpolation finishes the proof.
*
Truncated τ-cycle
We now consider the multigrid method using truncated stiffness, prolongation and restriction matrices.
We denote this by MGMTRUNCℓ(τ), and use it to solve
Eq. 35 with Aˇℓ=Aˇℓ;rℓ.
Lemmas 7.3 and 7.5 show that conditions for Theorem 6.6 are satisfied when K is chosen sufficiently large.
Theorem 7.7**.**
If τγd<1, we obtain
[TABLE]
Proof 7.8**.**
Define the floating point operation count for the truncated multi-grid method by
Mℓ:=FLOPS(x↦MGMTRUNCℓ(τ)(x,bℓ)).
By estimates Eqs. 38 and 40,
Pℓ:=FLOPS(x↦pˇℓ;rℓx),
Rℓ:=FLOPS(x↦rˇℓ;rℓx),
and
Aℓ:=FLOPS(x↦Aˇℓ;rℓx)
are each bounded by CKdlog(∣Ξℓ∣)d∣Ξℓ∣.
Because each Jacobi iteration involves multiplication by a matrix with the same number of nonzero entries, we note
that
[TABLE]
as well.
From this, we have the recursive formula
[TABLE]
Applying Eq. 38 and Eq. 40 gives
Mℓ≤CKd(ν1+ν2+3)(∣loghℓ∣dhℓ−d)+τMℓ−1.
By setting wℓ=(∣loghℓ∣/hℓ)d and C~:=CKd(ν1+ν2+3), we have the recurrence
[TABLE]
Note that
wℓ≤(∣logh0∣+ℓ∣logγ∣)dγ−dℓh0−d,
since hℓ≤γℓh0.
By Hölder’s inequality, we have
(∣logh0∣+ℓ∣logγ∣)d≤2dd−1(∣logh0∣d+ℓ∣logγ∣d),
which
provides the estimate
wℓ−k≤2dd−1h0−dγ−dℓγdk(∣logh0∣d+ℓ(1−k/ℓ)∣logγ∣d).
Applying this to the above estimate for Mℓ gives,
[TABLE]
*The result follows by taking (γℓh0)−d≤Ch−d and (∣logh0∣+ℓ∣logγ∣)∼∣logh∣.
*
Remark 7.9**.**
The kernel-based Galerkin problem AˇL;rLuˇL⋆=bL, can be solved stably to any precision ϵmax,
by iterating the algorithm MGMTRUNCL(τ)(uˇL,bL), i.e., the truncated multi-grid with τ≥2 cycle.
Select γ<1 and fix ν1 so that Theorem 6.6 holds.
Let uˇL(k+1)=MGMTRUNCL(τ)(uˇL(k),bL).
If k is the least integer satisfying γk∥uˇL⋆−uˇL(0)∥ℓ2<ϵmax,
then
[TABLE]
Due to Theorem 7.7, we obtain an overall complexity of
[TABLE]
*We note that
∥uˇL⋆−uˇL(k)∥AL∼∥σL∗(uˇL⋆−uˇL(k))∥W21≤CBhd/2−1∥uˇL⋆−uˇL(k)∥ℓ2,
and since d≥2, achieving ∥uˇL⋆−uˇL(k)∥AL<ϵmax also requires only a fixed number of iterations. This is the statement of Eq. 42.
*
Indeed, using k
steps of the Conjugate Gradient method on the original system Eq. 1, would give error
∥uˉL(k)−uL⋆∥AL≤(CNL2/d+1CNL2/d−1)k∥uL⋆−u(0)∥AL.
Thus to ensure a tolerance of ∥uˉL(k)−uL⋆∥AL≤ϵmax, one would need
[TABLE]
steps, where we use (CNL2/d+1CNL2/d−1)∼(1−C~N−2/d).
In contrast to this, the multigrid W-cycle requires only
[TABLE]
iterations to achieve error
∥uˇL(k)−uˇL⋆∥AL≤ϵmax.
In fact, it reaches ∣uˇL(k)−uˇL⋆∥ℓ2≤ϵmax, which is a stronger constraint, within k iterations.
In particular, the number of iterations is independent of the size NL of the problem.
Bibliography16
The reference list from the paper itself. Each links out to its DOI / PubMed record.
1[1] P. Collins , Kernel-Based Galerkin Methods on Compact Manifolds Without Boundary, with an Emphasis on SO(3) , Ph D thesis, University of Hawai’i at Manoa, available at https://scholarspace.manoa.hawaii.edu/items/936c 034f-5039-4ef 0-8f 03-03f 21d 71184 b , 2021.
2[2] G. Dziuk and C. M. Elliott , Finite element methods for surface PD Es , Acta Numer., 22 (2013), pp. 289–396, https://doi.org/10.1017/S 0962492913000056 . · doi ↗
3[3] E. Fuselier, T. Hangelbroek, F. J. Narcowich, J. D. Ward, and G. B. Wright , Localized bases for kernel spaces on the unit sphere , SIAM J. Numer. Anal., 51 (2013), pp. 2538–2562, https://doi.org/10.1137/120876940 . · doi ↗
4[4] T. Hangelbroek, F. J. Narcowich, C. Rieger, and J. D. Ward , Direct and inverse results on bounded domains for meshless methods via localized bases on manifolds , in Contemporary computational mathematics—a celebration of the 80th birthday of Ian Sloan. Vol. 1, 2, J. Dick, F. Kuo, and H. Woźniakowski, eds., Springer, Cham, 2018, pp. 517–543, https://doi.org/10.1007/978-3-319-72456-0_24 . · doi ↗
5[5] T. Hangelbroek, F. J. Narcowich, X. Sun, and J. D. Ward , Kernel approximation on manifolds II: the L ∞ subscript 𝐿 L_{\infty} norm of the L 2 subscript 𝐿 2 L_{2} projector , SIAM J. Math. Anal., 43 (2011), pp. 662–684, https://doi.org/10.1137/100795334 . · doi ↗
6[6] T. Hangelbroek, F. J. Narcowich, and J. D. Ward , Kernel approximation on manifolds I: bounding the Lebesgue constant , SIAM J. Math. Anal., 42 (2010), pp. 1732–1760, https://doi.org/10.1137/090769570 . · doi ↗
7[7] T. Hangelbroek, F. J. Narcowich, and J. D. Ward , Polyharmonic and related kernels on manifolds: interpolation and approximation , Found. Comput. Math., 12 (2012), pp. 625–670, https://doi.org/10.1007/s 10208-011-9113-5 . · doi ↗
8[8] V. John , Multigrid methods , tech. report, Lecture Notes available at https://www.wias-berlin.de/people/john/LEHRE/MULTIGRID/multigrid.pdf , 2014.