Asymptotic equivalence of non-parametric regression with spherical regressors and Gaussian white noise
Martin Kroll

TL;DR
This paper proves that non-parametric regression on spherical data with different sampling designs becomes statistically equivalent to Gaussian white noise models as sample size grows, under certain smoothness conditions.
Contribution
It establishes the asymptotic equivalence of regression experiments with spherical designs and Gaussian white noise, extending the Le Cam theory to spherical regressors.
Findings
Regression experiments are asymptotically equivalent to Gaussian white noise models.
Equivalence holds over spherical Sobolev and Besov balls under specified conditions.
Sharpness of smoothness assumptions is demonstrated through non-equivalence results.
Abstract
We study the asymptotic behaviour of both spherical $t$-designs and random uniform designs as the set of sampling points in non-parametric regression with spherical regressors of arbitrary dimension. We show that the corresponding regression experiments are asymptotically equivalent, in the sense of Le Cam, to the same sequence of Gaussian white noise experiments as the sample size tends to infinity. More precisely, global asymptotic equivalence is established over spherical Sobolev balls (for both the fixed and the random uniform design case) and over spherical Besov balls (for the fixed design case). Matching non-equivalence results demonstrate that the imposed smoothness assumptions are essentially sharp.
Universität Bayreuth
Keywords: non-parametric regression, spherical $t$-designs, random uniform design, Gaussian white noise, Le Cam distance, asymptotic equivalence of experiments
1 Introduction
Regression models with spherical regressors of arbitrary dimension and real-valued responses occur in a wide variety of scientific disciplines. Instances of such models appear in the Earth sciences, where physical processes on the Earth (which is approximately a two-dimensional sphere with a radius of about 6371 km) are considered. Specific examples are the global temperature field [OL04] and the Earth’s magnetic field [HCM03]. Other applications involving functions on the two-dimensional sphere can be found in biology, for instance, for the purpose of cell-shape modeling [RM18], or in texture analysis [SB03]. The special case of functions defined on the three-dimensional sphere arises in crystallography, where it can be used to describe probability distributions of crystalline orientations [MS08], or in medical imaging [Hos+13]. Besides, spherical harmonic expansions find various applications in quantum theory [Ave93].
Motivated by its wide range of applications, we consider the non-parametric regression model with observations $Y_1, \ldots, Y_n$ obeying the model equations
$$Y_i = f(x_i) + \sigma \xi_i, \qquad i = 1, \ldots, n, \tag{1.1}$$
where $f$ is the unknown regression function defined on the $d$-dimensional sphere $\mathbb{S}^d$, $\{x_1, \ldots, x_n\} \subset \mathbb{S}^d$ is the finite set of (deterministic or random) sampling points (following the standard convention, we will from now on denote deterministic sampling points with lowercase and random sampling points with uppercase letters), and $\xi_1, \ldots, \xi_n$ are i.i.d. standard Gaussian random variables independent of the design. The noise level $\sigma > 0$ is assumed to be known. Non-parametric estimators of the regression function in model (1.1) have already been studied: [Wah81] and [ANS96] consider spline interpolation and smoothing on the sphere. Local polynomial smoothing for circular data is treated in [DMPT09]. Series estimators in terms of spherical harmonics as well as wavelet-like series estimators are studied in [NPW06a, Wia+08, Mon11]. Let us also mention that, in the context of density estimation with spherical data, kernel methods [HWC87], Fourier expansions [Hen90], and methods based on spherical needlets [Bal+09] have already been considered.
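As a small illustration of the observation scheme (1.1), the following sketch simulates noisy responses at sampling points on the sphere. The function names, the test function `f`, the noise level, and the sample size are hypothetical choices made only for this example; the uniform design is generated by normalizing standard Gaussian vectors, which is a standard way to sample uniformly from a sphere.

```python
import numpy as np

def sample_uniform_sphere(n, d, rng):
    """Draw n i.i.d. uniform points on the d-dimensional sphere in R^(d+1)."""
    g = rng.standard_normal((n, d + 1))
    return g / np.linalg.norm(g, axis=1, keepdims=True)

def simulate_regression(f, design, sigma, rng):
    """Return responses f(x_i) + sigma * noise_i for the rows x_i of `design`."""
    noise = rng.standard_normal(design.shape[0])
    return f(design) + sigma * noise

rng = np.random.default_rng(0)
design = sample_uniform_sphere(n=500, d=2, rng=rng)

def f(x):
    # Hypothetical regression function: squared third coordinate.
    return x[:, 2] ** 2

y = simulate_regression(f, design, sigma=0.1, rng=rng)
```

Replacing `design` by a deterministic point set gives the fixed design version of the same model.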
The theoretical analysis of the regression model (1.1), notwithstanding its practical relevance, is hindered by the discrete nature of the model due to the measurements taken at isolated sampling points. Consequently, one might be tempted to replace the model (1.1) by a continuous Gaussian white noise model,
$$\mathrm{d}Y(x) = f(x)\,\mu(\mathrm{d}x) + \varepsilon\,\mathrm{d}W(x), \qquad x \in \mathbb{S}^d, \tag{1.2}$$
where $\varepsilon > 0$ is a suitable noise level, depending on both the noise level $\sigma$ and the sample size $n$ in the discrete model (1.1), $\mu$ is the normalized surface area measure on $\mathbb{S}^d$, and $W$ is a standard Gaussian white noise process on $\mathbb{S}^d$. Indeed, model (1.2) allows for a rigorous and neat mathematical analysis, which is conducted in [Kle99], where the sharp asymptotic minimax risk for different function classes and loss functions is derived. To the best of our knowledge, no theoretical study has yet addressed the relationship between the more realistic observation model (1.1) and the more tractable model (1.2). The present paper aims to close this gap.
More precisely, the main purpose of the present work is to state (essentially sharp) conditions on the set of sampling points and the class of admissible regression functions that allow one to replace the discrete model (1.1) with the continuous surrogate (1.2) (or vice versa). For this, we rely on Le Cam’s theory of asymptotic equivalence of experiments. Within this theory, the discrepancy between statistical experiments $\mathcal{E}$ and $\mathcal{F}$ sharing the same parameter space is quantified using a pseudo-metric $\Delta$, commonly referred to as the Le Cam distance. Two sequences $(\mathcal{E}_n)_{n \in \mathbb{N}}$ and $(\mathcal{F}_n)_{n \in \mathbb{N}}$ of statistical experiments having the same parameter space are said to be asymptotically equivalent, in the sense of Le Cam, if
$$\lim_{n \to \infty} \Delta(\mathcal{E}_n, \mathcal{F}_n) = 0.$$
From an inferential point of view, asymptotically equivalent experiments are equally informative in the limit. For a more comprehensive account of asymptotic equivalence theory we refer the reader to Chapter 1 of [GN16], as well as the survey papers [Nus04] and [Mar16].
Since the seminal paper [BL96], asymptotic equivalence of many non-parametric experiments has been established. The articles [BL96, Roh04, Rei08] consider fixed design regression on the unit interval with equidistant sampling points, whereas [Bro+02] deals with the random design case. [Rei08] also extends the results from [BL96] and [Roh04] to the multivariate and random design case. In addition, the papers [BL96] and [Rei08] discuss minor deviations from the assumption of equidistant sampling points. [GN02] considers asymptotic equivalence for non-parametric regression with centered, but non-Gaussian, noise, whereas [MR13] considers non-regular errors. The contributions [Car09], [SH14], and [DK22] weaken the i.i.d. assumption on the noise. The paper [CZ09] develops asymptotic equivalence theory for robust non-parametric regression. The limits of asymptotic equivalence theory are discussed in [BZ98] and [ES96], which provide examples of asymptotically non-equivalent experiments. However, all the papers cited so far discuss asymptotic equivalence for regression experiments and a corresponding Gaussian white noise model only when the regression domain is a subset of some Euclidean space (admittedly, the case of periodic regression functions on $[0,1]^d$, considered also in [Rei08], can be interpreted as a first step to a manifold setup due to the topological identification of $[0,1]^d$ and the $d$-dimensional torus $\mathbb{T}^d$).
Concerning the choice of design points, the asymptotic equivalence results cited above are obtained under the assumption that the sampling points are evenly spread over the whole regression domain. In the deterministic design case, this is achieved by choosing sampling points forming a regular grid; in the random design case, the natural approach is to sample from the uniform distribution on the regression domain. In the following, we briefly review these two cases, which will be considered in the main part of the paper.
In the one-dimensional case considered in [BL96] and [Roh04], the target parameter is a function defined on the unit interval, and the canonical deterministic design that yields asymptotic equivalence with the corresponding Gaussian white noise model is the equidistant grid with sampling points $x_i = i/n$ for $i = 1, \ldots, n$. In the multivariate setup, studied in [Rei08], the regression domain is given by the $d$-dimensional unit cube $[0,1]^d$ and a regular grid of sampling points of the form $(i_1/m, \ldots, i_d/m)$ with $i_1, \ldots, i_d \in \{1, \ldots, m\}$ and $n = m^d$ is assumed. For more complicated regression domains, possibly subsets of manifolds, a comparable notion of evenly spread deterministic point sets is not evident. For the special case of spheres, various measures to evaluate the distributional properties of finite point sets exist [BG15]. Besides its intrinsic mathematical motivation, the problem of finding evenly spread points on spheres is of fundamental relevance in fields like viral morphology, crystallography, molecular structure, and electrostatics. We refer the reader to [SK97] for further discussion.
In the following, we will build on the notion of spherical $t$-designs as originally introduced in [DGS77]. A finite, non-empty set $\{x_1, \ldots, x_n\} \subset \mathbb{S}^d$ is called a spherical $t$-design if the identity
$$\frac{1}{n} \sum_{i=1}^{n} p(x_i) = \int_{\mathbb{S}^d} p(x)\,\mu(\mathrm{d}x) \tag{1.3}$$
holds for all polynomials $p$ of total degree at most $t$ in $d+1$ variables. Since the work of [SZ84], it is known that spherical $t$-designs exist for all combinations of $t$ and $d$, provided that $n$ is sufficiently large. In [BRV13] it has finally been proven that spherical $t$-designs in $\mathbb{S}^d$ exist for all $n \geq C_d t^d$ with a numerical constant $C_d$ depending only on the dimension of the sphere. This result is essentially optimal since for the minimal number $n$ of points forming a spherical $t$-design the estimate $n \geq c_d t^d$ has already been established in [DGS77].
The notion of spherical $t$-designs, originally introduced in the field of algebraic combinatorics, has connections to various fields of mathematics and we refer to the survey article [BB09] for a comprehensive overview. In the area of numerical analysis, spherical $t$-designs have already attracted some interest, for instance, as cubature points for numerical integration of functions on spheres [HSW10]. Spherical $t$-designs of small cardinality and low dimension are explicitly known and correspond to highly symmetrical point configurations. For instance, spherical $t$-designs on the two-dimensional sphere are obtained as the vertex sets of the regular tetrahedron (for $t = 2$), the cube and the regular octahedron (both for $t = 3$), the regular dodecahedron and the regular icosahedron (both for $t = 5$). For larger values of $t$, spherical $t$-designs possessing good geometric properties are also known to exist [Wom18]. Figure 1 illustrates spherical $t$-designs on the two-dimensional sphere for two larger values of $t$, highlighting their excellent distributional properties.
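The design property of the Platonic solids mentioned above can be checked numerically. The sketch below compares equal-weight averages of monomials over a vertex set with the exact monomial integrals over the sphere, for which a closed form in terms of Gamma functions is available. The helper names `monomial_integral` and `design_strength`, the tolerance, and the maximal degree are assumptions made for this illustration only.

```python
import itertools
import math
import numpy as np

def monomial_integral(alpha):
    """Exact integral of x^alpha over the sphere w.r.t. the normalized surface measure."""
    if any(a % 2 for a in alpha):
        return 0.0  # odd exponents integrate to zero by symmetry
    m = len(alpha)  # ambient dimension d + 1
    betas = [(a + 1) / 2 for a in alpha]
    log_val = (sum(math.lgamma(b) for b in betas) + math.lgamma(m / 2)
               - math.lgamma(sum(betas)) - (m / 2) * math.log(math.pi))
    return math.exp(log_val)

def design_strength(points, max_degree=8, tol=1e-10):
    """Largest t <= max_degree for which the equal-weight rule is exact up to degree t."""
    points = np.asarray(points, dtype=float)
    m = points.shape[1]
    for degree in range(1, max_degree + 1):
        for alpha in itertools.product(range(degree + 1), repeat=m):
            if sum(alpha) != degree:
                continue
            empirical = np.mean(np.prod(points ** np.array(alpha), axis=1))
            if abs(empirical - monomial_integral(alpha)) > tol:
                return degree - 1
    return max_degree

tetrahedron = np.array([[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]) / np.sqrt(3)
octahedron = np.vstack([np.eye(3), -np.eye(3)])
cube = np.array(list(itertools.product([-1, 1], repeat=3))) / np.sqrt(3)

golden = (1 + np.sqrt(5)) / 2
vertices = []
for s1, s2 in itertools.product([-1.0, 1.0], repeat=2):
    vertices += [(0.0, s1, s2 * golden), (s1, s2 * golden, 0.0), (s2 * golden, 0.0, s1)]
icosahedron = np.array(vertices) / np.sqrt(1 + golden ** 2)

strengths = {
    "tetrahedron": design_strength(tetrahedron),
    "octahedron": design_strength(octahedron),
    "cube": design_strength(cube),
    "icosahedron": design_strength(icosahedron),
}
```

Checking monomials of each exact degree up to $t$ suffices, because every polynomial of degree at most $t$ is a linear combination of such monomials.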
In view of these properties, spherical $t$-designs seem to be a promising choice of sampling points in (non-parametric) regression with spherical regressors of arbitrary dimension. This heuristic will be supported in Section 3 by means of appropriate asymptotic equivalence results. The first main result of this paper provides an upper bound on the Le Cam distance between the models (1.1) and (1.2) for a general function class when the fixed sampling points form a spherical $t$-design. In the sequel, this general result is applied to two special cases: First, we prove global asymptotic equivalence for regression functions from spherical Sobolev balls of smoothness $s > d/2$. Sobolev spaces on spheres can either be defined in terms of charts or via the decay of coefficients in spherical harmonic expansions. We rely on the latter characterization, which is sufficient for our purposes. As a second application, which extends the first one, we consider spherical Besov spaces $B^s_{p,q}$, which can be defined in terms of a function’s coefficients with respect to a set of spherical needlets. For this more general setup, we derive asymptotic equivalence between the models (1.1) and (1.2) under the assumption $s > d/p$. This condition and the analogous condition in the Sobolev case guarantee that the considered function spaces can be continuously embedded in the Banach space of continuous functions on the sphere.
Proving global asymptotic equivalence over Sobolev and Besov balls builds on results from numerical analysis concerning the approximation by a so-called hyperinterpolation [Slo95] which, in statistical terminology, coincides with the least-squares estimator in certain cases. The hyperinterpolation of the regression function is used to define an intermediate experiment between the experiments defined by (1.1) and (1.2), respectively. Since the sample size must in general be chosen strictly larger than the model dimension of the intermediate experiment, the hyperinterpolation does usually not interpolate the regression function at the design points. In contrast, in the cited papers for the Euclidean setting, for instance [Rei08], asymptotic equivalence is proven by means of an intermediate experiment defined in terms of a suitable interpolation of the regression function. Consequently, in the Euclidean case, the intermediate experiment and the regression experiment (1.1) are even non-asymptotically equivalent. In the spherical case, equivalence in the sense of Le Cam between the intermediate experiment and the regression experiment holds only asymptotically and additional estimates are necessary. In the general bound on the Le Cam distance between the experiments (1.1) and (1.2), this leads to an extra term that is not present in the Euclidean setting considered in [Rei08].
In the second part of the paper, we will consider the regression model (1.1) where the design points are i.i.d. according to the uniform distribution $\mu$, that is, the normalized surface area measure on the sphere. This model is more realistic in many applications where sampling points cannot be chosen by the experimenter but are themselves random. Two instances of random uniform designs, with sample sizes equal to the ones in Figure 1, are shown in Figure 2 for the case $d = 2$.
A comparison of Figures 1 and 2 suggests that a typical realization of a random design is much less regular than a spherical $t$-design with good geometric properties (there exist both data voids and exceptionally close sampling points within the random design). Furthermore, the exact cubature formula (1.3) for spherical $t$-designs holds only in expectation for the random uniform design case, which makes the analysis much more involved. This has already been noticed in [Bro+02] and [Rei08], where a separate investigation of low- and high-frequency coefficients of the regression function was necessary in order to establish asymptotic equivalence with a Gaussian white noise model analogous to (1.2). This separate treatment of low- and high-frequency coefficients also appears in our analysis, which in its overall structure follows the one from [Rei08]. However, some special properties related to the underlying spherical geometry are of importance and additional tools from numerical analysis and representation theory will be used. Using these tools, asymptotic equivalence of random uniform design regression and the Gaussian white noise model is proven over Sobolev balls on the sphere. As in the fixed design case, asymptotic equivalence holds for $s > d/2$. The non-equivalence of both fixed and random design regression and Gaussian white noise below these smoothness thresholds, for Sobolev as well as Besov balls, is established in Section 5.
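The difference between exact cubature and cubature in expectation can be made concrete with a small experiment. The sketch below integrates the polynomial $x_1^2$ over the two-dimensional sphere (exact value $1/3$ for the normalized measure) once with the octahedron vertices, which form a spherical 3-design, and once with random uniform designs. The seed, sample sizes, and number of repetitions are arbitrary choices for this illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def uniform_design(n, d, rng):
    """Draw n i.i.d. uniform points on the d-dimensional sphere."""
    g = rng.standard_normal((n, d + 1))
    return g / np.linalg.norm(g, axis=1, keepdims=True)

exact = 1.0 / 3.0  # integral of x_1^2 over S^2 w.r.t. the normalized measure

# Deterministic design: the six octahedron vertices integrate x_1^2 exactly.
octahedron = np.vstack([np.eye(3), -np.eye(3)])
error_design = abs(np.mean(octahedron[:, 0] ** 2) - exact)

# Random design: the equal-weight rule is only unbiased, with fluctuating error.
errors_random = {}
for n in (100, 10_000):
    reps = [abs(np.mean(uniform_design(n, 2, rng)[:, 0] ** 2) - exact)
            for _ in range(200)]
    errors_random[n] = float(np.mean(reps))
```

In this experiment the average error of the random rule shrinks as the sample size grows but never vanishes for finite $n$, whereas the six-point design has no cubature error at all for this integrand.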
The rest of the paper is organized as follows. In the preliminary Section 2, we provide some background on spherical harmonic expansions, asymptotic equivalence theory, and the Gaussian white noise model. In Section 3.1 we derive the announced general bound on the Le Cam distance between the regression experiment defined by (1.1) on a spherical $t$-design and the Gaussian white noise model (1.2). This bound is then used to establish asymptotic equivalence between these models when the target functional parameter belongs to a Sobolev (Section 3.2.1) or Besov (Section 3.2.2) ball. Section 4 deals with asymptotic equivalence of random uniform design regression and Gaussian white noise over Sobolev balls. In Section 5, we prove matching non-equivalence results showing that the smoothness assumptions from Sections 3 and 4 cannot be improved. In Section 6, we conclude with a brief discussion indicating some connections to optimal design theory and open problems. All proofs are deferred to Section 7.
Notation
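For sequences $(a_n)_{n \in \mathbb{N}}$ and $(b_n)_{n \in \mathbb{N}}$, we write $a_n \lesssim b_n$ if there exists a universal (and irrelevant) numerical constant $C > 0$ such that $a_n \leq C b_n$ holds. The notion $a_n \gtrsim b_n$ is defined analogously and $a_n \asymp b_n$ means that both $a_n \lesssim b_n$ and $a_n \gtrsim b_n$ hold simultaneously. Throughout, we denote the identity operator on a space $V$ by $\mathrm{Id}_V$, and the $m \times m$ identity matrix by $I_m$. A norm $\|\cdot\|$ without subindex refers to the usual Euclidean norm (where the dimension of the Euclidean space is suppressed in the notation).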
2 Preliminaries
2.1 Spherical harmonics
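Let $\mu$ be the normalized surface area measure on the $d$-dimensional sphere $\mathbb{S}^d = \{x \in \mathbb{R}^{d+1} : \|x\| = 1\}$ in $(d+1)$-dimensional ambient Euclidean space. We denote with
$$L^2(\mathbb{S}^d) = \Big\{ f \colon \mathbb{S}^d \to \mathbb{R} \;:\; \int_{\mathbb{S}^d} |f(x)|^2 \,\mu(\mathrm{d}x) < \infty \Big\}$$
the Hilbert space of (equivalence classes of) square-integrable, real-valued functions on $\mathbb{S}^d$. More generally, for any $p \in [1, \infty]$, we consider the Banach spaces
$$L^p(\mathbb{S}^d) = \big\{ f \colon \mathbb{S}^d \to \mathbb{R} \;:\; \|f\|_{L^p} < \infty \big\}, \qquad \|f\|_{L^p} = \Big( \int_{\mathbb{S}^d} |f(x)|^p \,\mu(\mathrm{d}x) \Big)^{1/p} \text{ for } p < \infty, \qquad \|f\|_{L^\infty} = \operatorname*{ess\,sup}_{x \in \mathbb{S}^d} |f(x)|.$$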
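The Laplace-Beltrami operator $\Delta_{\mathbb{S}^d}$ on $\mathbb{S}^d$ gives rise to the decomposition
$$L^2(\mathbb{S}^d) = \bigoplus_{\ell=0}^{\infty} \mathcal{H}_\ell,$$
where $\mathcal{H}_\ell$, $\ell \in \mathbb{N}_0$, denote the eigenspaces associated with the increasing sequence of non-negative eigenvalues $\lambda_\ell$ of $-\Delta_{\mathbb{S}^d}$. It is known that $\lambda_\ell = \ell(\ell + d - 1)$ and that the dimension of $\mathcal{H}_\ell$ equals
$$\dim \mathcal{H}_0 = 1, \qquad \dim \mathcal{H}_\ell = \frac{2\ell + d - 1}{\ell} \binom{\ell + d - 2}{\ell - 1} \quad \text{for } \ell \geq 1.$$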
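For any $\ell \in \mathbb{N}_0$, we choose an orthonormal basis $\{Y_{\ell,k} : k = 1, \ldots, \dim \mathcal{H}_\ell\}$ of $\mathcal{H}_\ell$ consisting of real-valued eigenfunctions of $-\Delta_{\mathbb{S}^d}$ associated with the eigenvalue $\lambda_\ell$. Any such set of orthonormal basis functions is referred to as spherical harmonics of degree $\ell$. Any $f \in L^2(\mathbb{S}^d)$ can be represented as an infinite series
$$f = \sum_{\ell=0}^{\infty} \sum_{k=1}^{\dim \mathcal{H}_\ell} \theta_{\ell,k} Y_{\ell,k} \tag{2.1}$$
with coefficients
$$\theta_{\ell,k} = \langle f, Y_{\ell,k} \rangle_{L^2} = \int_{\mathbb{S}^d} f(x) Y_{\ell,k}(x)\,\mu(\mathrm{d}x). \tag{2.2}$$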
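For convenience, we suppress the dependence of the spherical harmonics and the corresponding coefficients on $d$ in our notation. On occasion, it will turn out convenient to switch from the double index notation to a single index. This is achieved by a one-to-one enumeration function
$$\iota \colon \{(\ell, k) : \ell \in \mathbb{N}_0,\ k = 1, \ldots, \dim \mathcal{H}_\ell\} \to \mathbb{N}$$
for which we additionally assume that
$$\iota(\ell, k) < \iota(\ell', k') \quad \text{whenever } \ell < \ell'.$$
In this case, we set $Y_j = Y_{\ell,k}$ and $\theta_j = \theta_{\ell,k}$ if $j = \iota(\ell, k)$. For instance, the representation (2.1) can alternatively be written as
$$f = \sum_{j=1}^{\infty} \theta_j Y_j.$$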
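We denote with
$$\Pi_L = \bigoplus_{\ell=0}^{L} \mathcal{H}_\ell$$
the finite-dimensional space spanned by all spherical harmonics up to degree $L$, which has dimension
$$\dim \Pi_L = \sum_{\ell=0}^{L} \dim \mathcal{H}_\ell = \binom{L + d}{d} + \binom{L + d - 1}{d}.$$
We write $P_V f$ for the $L^2$-orthogonal projection of $f \in L^2(\mathbb{S}^d)$ onto some subspace $V$ of $L^2(\mathbb{S}^d)$. In particular, for $f$ in (2.1) we have
$$P_{\Pi_L} f = \sum_{\ell=0}^{L} \sum_{k=1}^{\dim \mathcal{H}_\ell} \theta_{\ell,k} Y_{\ell,k}.$$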
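For concreteness, the standard dimension and eigenvalue formulas for spherical harmonics can be evaluated as follows. The function names are hypothetical; the formulas used are the classical ones, namely $\ell(\ell+d-1)$ for the eigenvalues, $\frac{2\ell+d-1}{\ell}\binom{\ell+d-2}{\ell-1}$ for the dimension of the degree-$\ell$ eigenspace ($\ell \geq 1$), and $\binom{L+d}{d} + \binom{L+d-1}{d}$ for the space of harmonics up to degree $L$.

```python
from math import comb

def eigenvalue(d, ell):
    """Eigenvalue of the negative Laplace-Beltrami operator on S^d for degree ell."""
    return ell * (ell + d - 1)

def dim_harmonics(d, ell):
    """Dimension of the space of spherical harmonics of degree ell on S^d."""
    if ell == 0:
        return 1
    # The product is always divisible by ell, so integer division is exact.
    return (2 * ell + d - 1) * comb(ell + d - 2, ell - 1) // ell

def dim_poly_space(d, L):
    """Dimension of the span of all spherical harmonics of degree <= L on S^d."""
    return comb(L + d, d) + comb(L + d - 1, d)

# On the two-dimensional sphere: 2*ell + 1 harmonics per degree, (L + 1)^2 in total.
dims_s2 = [dim_harmonics(2, ell) for ell in range(5)]
total_s2 = dim_poly_space(2, 4)
```

For $d = 2$ this reproduces the familiar count of $2\ell + 1$ harmonics of degree $\ell$.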
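In the proofs of Section 7, we will on several occasions exploit the addition formula for spherical harmonics (see [DX13], Lemma 1.2.3 and Corollary 1.2.7),
$$\sum_{k=1}^{\dim \mathcal{H}_\ell} Y_{\ell,k}(x) Y_{\ell,k}(y) = \dim \mathcal{H}_\ell \cdot G_\ell(\langle x, y \rangle), \qquad x, y \in \mathbb{S}^d, \tag{2.3}$$
where $G_\ell$ denotes the Gegenbauer polynomial of degree $\ell$ associated with $\mathbb{S}^d$, normalized such that $G_\ell(1) = 1$.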
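The addition formula can be verified numerically in a simple special case. The sketch below uses the two-dimensional sphere and degree 2, where the normalized Gegenbauer polynomial reduces to the Legendre polynomial $P_2(t) = (3t^2 - 1)/2$ and the eigenspace has dimension 5. The explicit real harmonics are a standard choice, scaled to be orthonormal with respect to the normalized surface measure; the function names and the random test points are assumptions of this example.

```python
import numpy as np

def harmonics_degree_2(x):
    """Five real spherical harmonics of degree 2 on S^2 (orthonormal for the normalized measure)."""
    x1, x2, x3 = x
    return np.array([
        np.sqrt(15) * x1 * x2,
        np.sqrt(15) * x2 * x3,
        np.sqrt(15) * x1 * x3,
        np.sqrt(15) / 2 * (x1 ** 2 - x2 ** 2),
        np.sqrt(5) / 2 * (3 * x3 ** 2 - 1),
    ])

def legendre_2(t):
    """Legendre polynomial of degree 2, normalized so that its value at 1 equals 1."""
    return (3 * t ** 2 - 1) / 2

rng = np.random.default_rng(2)
x, y = rng.standard_normal((2, 3))
x, y = x / np.linalg.norm(x), y / np.linalg.norm(y)

lhs = float(harmonics_degree_2(x) @ harmonics_degree_2(y))  # sum over the basis
rhs = 5 * legendre_2(float(x @ y))                          # dimension times kernel
```

Setting $x = y$ shows in particular that the sum of squared harmonics of a fixed degree is constant on the sphere.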
2.2 Asymptotic equivalence of statistical experiments
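We briefly recall the notion of Le Cam distance between statistical experiments and collect some properties that are essential for the remainder of the paper. Let $\mathcal{E} = (\mathcal{X}, \mathscr{A}, (P_\theta)_{\theta \in \Theta})$ and $\mathcal{F} = (\mathcal{Y}, \mathscr{B}, (Q_\theta)_{\theta \in \Theta})$ be two statistical experiments sharing the same parameter space $\Theta$. Recall that a Markov kernel $K$ from $(\mathcal{X}, \mathscr{A})$ to $(\mathcal{Y}, \mathscr{B})$ induces a map that transports probability measures from $(\mathcal{X}, \mathscr{A})$ to $(\mathcal{Y}, \mathscr{B})$. Given a probability measure $P$ on $(\mathcal{X}, \mathscr{A})$, the probability measure $KP$ on $(\mathcal{Y}, \mathscr{B})$ is defined by
$$KP(B) = \int_{\mathcal{X}} K(x, B)\,P(\mathrm{d}x), \qquad B \in \mathscr{B}.$$
The Le Cam distance between $\mathcal{E}$ and $\mathcal{F}$ is defined as the symmetrized quantity
$$\Delta(\mathcal{E}, \mathcal{F}) = \max\{\delta(\mathcal{E}, \mathcal{F}), \delta(\mathcal{F}, \mathcal{E})\},$$
where the deficiency $\delta(\mathcal{E}, \mathcal{F})$ is defined by
$$\delta(\mathcal{E}, \mathcal{F}) = \inf_{K} \sup_{\theta \in \Theta} \|K P_\theta - Q_\theta\|_{\mathrm{TV}}.$$
The infimum in this definition is taken over all Markov kernels $K$ from $(\mathcal{X}, \mathscr{A})$ to $(\mathcal{Y}, \mathscr{B})$, and $\|\cdot\|_{\mathrm{TV}}$ denotes the total variation distance. If $\Delta(\mathcal{E}, \mathcal{F}) = 0$, the experiments $\mathcal{E}$ and $\mathcal{F}$ are said to be (exactly) equivalent in the sense of Le Cam. More generally, two sequences $(\mathcal{E}_n)_{n \in \mathbb{N}}$ and $(\mathcal{F}_n)_{n \in \mathbb{N}}$ of statistical experiments having the same parameter space are said to be asymptotically equivalent if
$$\lim_{n \to \infty} \Delta(\mathcal{E}_n, \mathcal{F}_n) = 0.$$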
2.3 Gaussian white noise
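Consider the Gaussian white noise model with continuous observation
$$\mathrm{d}Y(x) = f(x)\,\mu(\mathrm{d}x) + \varepsilon\,\mathrm{d}W(x), \qquad x \in \mathbb{S}^d, \tag{2.4}$$
where $f \in L^2(\mathbb{S}^d)$, $\varepsilon > 0$, and $W$ is standard Gaussian white noise on the $d$-dimensional sphere. The stochastic differential equation (2.4) can be interpreted in a distributional sense as follows: (2.4) is equivalent to observing a Gaussian process $(Y(\varphi))_{\varphi \in L^2(\mathbb{S}^d)}$, indexed by the set $L^2(\mathbb{S}^d)$ of test functions, which is defined by
$$Y(\varphi) = \int_{\mathbb{S}^d} f(x) \varphi(x)\,\mu(\mathrm{d}x) + \varepsilon W(\varphi)$$
for $\varphi \in L^2(\mathbb{S}^d)$. Here, the white noise part
$$W(\varphi) = \int_{\mathbb{S}^d} \varphi(x)\,\mathrm{d}W(x)$$
is a centered Gaussian process with covariance structure
$$\mathbb{E}[W(\varphi) W(\psi)] = \langle \varphi, \psi \rangle_{L^2} = \int_{\mathbb{S}^d} \varphi(x) \psi(x)\,\mu(\mathrm{d}x).$$
Evaluating the process at an orthonormal basis $(\varphi_j)_{j \in \mathbb{N}}$ of $L^2(\mathbb{S}^d)$ shows that observing (2.4) is equivalent to the infinite-dimensional Gaussian sequence model
$$Y_j = \theta_j + \varepsilon \eta_j, \qquad j \in \mathbb{N},$$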
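where $(\theta_j)_{j \in \mathbb{N}}$ is the sequence of Fourier coefficients of $f$ (with respect to the basis $(\varphi_j)_{j \in \mathbb{N}}$) and $(\eta_j)_{j \in \mathbb{N}}$ is a sequence of independent standard Gaussian random variables. Let $P_f$ denote the distribution of the process in (2.4). The following expression for the total variation distance between $P_f$ and $P_g$ (with the normalization $\|P - Q\|_{\mathrm{TV}} = \sup_A |P(A) - Q(A)|$) is derived in [Car06], Section 3.2, and will be used frequently in the proofs of Section 7:
$$\|P_f - P_g\|_{\mathrm{TV}} = 2\,\Phi\Big( \frac{\|f - g\|_{L^2}}{2\varepsilon} \Big) - 1,$$
where $\Phi$ denotes the distribution function of a standard Gaussian random variable.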
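To get a feeling for this quantity, the closed form for the total variation distance between two Gaussian shift measures with common noise level can be evaluated directly. The sketch below assumes the normalization in which the total variation distance takes values in $[0, 1]$, so that the distance equals $2\Phi(\|f-g\|_{L^2}/(2\varepsilon)) - 1$; the function names and the numerical values are hypothetical.

```python
from math import erf, sqrt

def std_normal_cdf(z):
    """Distribution function of a standard Gaussian random variable."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def tv_white_noise(l2_distance, eps):
    """Total variation distance between two Gaussian shift measures.

    `l2_distance` is the L2 distance between the two signals and `eps`
    is the common noise level; the result lies in [0, 1].
    """
    return 2.0 * std_normal_cdf(l2_distance / (2.0 * eps)) - 1.0

# With noise level sigma / sqrt(n), two fixed signals become easier to
# distinguish as the sample size n grows.
sigma, distance = 1.0, 0.05
tv_small_n = tv_white_noise(distance, sigma / sqrt(100))
tv_large_n = tv_white_noise(distance, sigma / sqrt(10_000))
```

The distance depends on the two signals only through the ratio of their $L^2$ distance to the noise level, which is why bounds on approximation errors translate directly into bounds on total variation.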
3 Regression on spherical $t$-designs
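For a class $\Theta$ of functions on $\mathbb{S}^d$ and a finite, non-empty set $\{x_1, \ldots, x_n\} \subset \mathbb{S}^d$ of fixed sampling points, we denote by $\mathcal{E}_n$ the regression experiment with observations
$$Y_i = f(x_i) + \sigma \xi_i, \qquad i = 1, \ldots, n, \tag{3.1}$$
where $f \in \Theta$ is the unknown regression function, $\xi_1, \ldots, \xi_n$ are i.i.d. standard Gaussian, and $\sigma > 0$. We assume that the standard deviation $\sigma$ of the additive noise is known and suppress the dependence of the experiment on $\sigma$ in the notation. Similarly, we denote by $\mathcal{G}_n$ the Gaussian white noise experiment defined via Equation (1.2) in the introduction with $\varepsilon = \sigma/\sqrt{n}$, that is, given by
$$\mathrm{d}Y(x) = f(x)\,\mu(\mathrm{d}x) + \frac{\sigma}{\sqrt{n}}\,\mathrm{d}W(x), \qquad x \in \mathbb{S}^d, \quad f \in \Theta. \tag{3.2}$$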
3.1 A general bound for the Le Cam distance
In order to state a general bound on the Le Cam distance , we introduce some further notation. Given a finite set of approximating functions in , not necessarily forming an orthonormal basis of their span , we consider the finite-dimensional approximation
[TABLE]
of a function where
[TABLE]
The empirical counterparts of the coefficients are obtained by the corresponding equal-weight cubature rules at the sampling points ,
[TABLE]
Since spherical harmonics are restrictions of polynomials in variables to the unit sphere with the index indicating the degree of the polynomial associated with (see [DX13], Chapter 1, for details), a spherical -design with yields an equal-weight cubature rule that is exact for functions in , that is,
[TABLE]
The semi-norm associated with the inner product is denoted by . Replacing in (3.3) with leads to the empirical approximation
[TABLE]
Note that while depends on via exact -inner products, the approximation is fully data-driven by means of the cubature formula (3.4). Associated with we introduce the intermediate experiment with continuous observation
[TABLE]
Note that the intermediate experiment , contrary to the Gaussian white noise experiment , depends on the sampling points via the empirical coefficients . The following theorem states the announced upper bound on the Le Cam distance .
Theorem 3.1.
Consider the experiments , defined by (3.1), and , defined by (3.2), where the unknown parameter belongs to some class of functions defined on the -dimensional sphere . Assume that for some and that is a spherical -design for . Then, the Le Cam distance between the two experiments is bounded by
[TABLE]
where
[TABLE]
and
[TABLE]
Remark 3.2.
Let us briefly comment on the two terms appearing in the bound on derived in the theorem. The second term
[TABLE]
is not surprising as it already appears in known results for the multivariate Euclidean case studied before (see, for instance, the bound stated in Theorem 2.4 of [Rei08]). In contrast, the first term
[TABLE]
is new. In the setting of Theorem 3.1, the approximant can in general not be chosen to interpolate at the sampling points . This is due to the non-existence, except for few special cases, of so-called tight spherical -designs. A more detailed discussion of this issue is given in Remark 3.6.
Remark 3.3.
The proof of Theorem 3.1 is constructive in the sense that it provides explicit Markov kernels for transferring observations between the two considered experiments. These kernels involve randomizations, which, however, require knowledge of the noise level . This assumption is common in the literature on asymptotic equivalence between non-parametric regression and Gaussian white noise experiments (see, for instance, [Bro+02, Roh04, Rei08]). Only few works address the more realistic case with an unknown noise variance, [Car07] being a notable contribution in this direction.
Remark 3.4.
From Theorem 3.1 it is easily derivable that global asymptotic equivalence follows if the class consists of functions that satisfy the smoothness condition
[TABLE]
with as . This smoothness assumption appears in the numerical analysis literature since it guarantees uniform convergence of the hyperinterpolation to [TVB19]. For instance, for band-limited functions, (3.7) is trivially fulfilled. More precisely, for the function class
[TABLE]
for some fixed , the considered regression model and the Gaussian white noise model are exactly equivalent, for resp. sufficiently large, to observing a multivariate Gaussian of dimension with unknown mean in and covariance matrix .
3.2 Results for specific function classes
In the following two sections, we apply the general Theorem 3.1 to two specific function classes: Sobolev balls (Section 3.2.1) and Besov balls (Section 3.2.2).
3.2.1 Asymptotic equivalence over Sobolev balls
For a smoothness parameter (not necessarily being an integer), we define the (fractional) Sobolev norm by
[TABLE]
where is the sequence of Fourier coefficients of the function defined via (2.2). The Sobolev space is defined as the set of functions such that . It is easy to verify that
[TABLE]
(replacing the -norm by another -norm can be used to define fractional Sobolev spaces for general ). The Fourier multiplier approach presented here to define fractional Sobolev spaces is sufficient for the purposes of this paper. This approach is equivalent to the one that defines Sobolev spaces in terms of charts [Tri86]. The first specific function class for which asymptotic equivalence results will be derived in the sequel are the Sobolev balls
[TABLE]
consisting of functions in with Sobolev norm bounded by .
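In order to derive from Theorem 3.1 asymptotic equivalence of the regression experiment and the Gaussian white noise experiment over Sobolev balls, we take as approximating functions $\varphi_1, \ldots, \varphi_m$ the set of all spherical harmonics up to a certain resolution level $L$, that is, we set $\varphi_j = Y_j$ if $j \leq \dim \Pi_L$ for the enumeration function introduced in Section 2.1. Consequently, the span of the approximating functions is $\Pi_L$ and $m = \dim \Pi_L$. In this specific setup, assuming that the set $\{x_1, \ldots, x_n\}$ of sampling points is a spherical $t$-design with $t \geq 2L$ ensures that the empirical approximation $\hat{f}_n$ defined in (3.5) is the least-squares approximation of $f$ in $\Pi_L$ based on the data $(x_i, f(x_i))$, $i = 1, \ldots, n$, that is,
$$\hat{f}_n = \operatorname*{arg\,min}_{g \in \Pi_L} \sum_{i=1}^{n} \big( f(x_i) - g(x_i) \big)^2.$$
In fact, this follows from the inclusion $\{\varphi_j \varphi_k : 1 \leq j, k \leq m\} \subset \Pi_{2L}$, which implies that for the design matrix $\Phi = (\varphi_j(x_i))_{i,j} \in \mathbb{R}^{n \times m}$ we have $\Phi^\top \Phi = n I_m$. Hence, the least-squares approximation is given by $\sum_{j=1}^{m} \hat{\theta}_j \varphi_j$ where $\hat{\theta} = (\hat{\theta}_1, \ldots, \hat{\theta}_m)^\top$ satisfies
$$\Phi^\top \Phi\, \hat{\theta} = \Phi^\top \big( f(x_1), \ldots, f(x_n) \big)^\top.$$
From this, we obtain $\hat{\theta}_j = \frac{1}{n} \sum_{i=1}^{n} f(x_i) \varphi_j(x_i)$, showing that the least-squares approximation coincides with $\hat{f}_n$ in this case. Before we devote ourselves to the application of Theorem 3.1 to Sobolev balls, we make two remarks.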
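The coincidence of the least-squares fit with the equal-weight cubature coefficients can be observed in a small example. The sketch below uses the 12 vertices of the regular icosahedron, which form a spherical 5-design on the two-dimensional sphere, together with the 9 real spherical harmonics of degree at most 2, so that all pairwise products have degree at most 4 and are integrated exactly. The test function `exp(x_3)`, the variable names, and the explicit harmonics (a standard choice, orthonormal for the normalized measure) are assumptions of this illustration.

```python
import itertools
import numpy as np

def harmonics_up_to_degree_2(x):
    """Design matrix of the 9 real spherical harmonics of degree <= 2 on S^2."""
    x1, x2, x3 = x[:, 0], x[:, 1], x[:, 2]
    return np.column_stack([
        np.ones_like(x1),
        np.sqrt(3) * x1, np.sqrt(3) * x2, np.sqrt(3) * x3,
        np.sqrt(15) * x1 * x2, np.sqrt(15) * x2 * x3, np.sqrt(15) * x1 * x3,
        np.sqrt(15) / 2 * (x1 ** 2 - x2 ** 2),
        np.sqrt(5) / 2 * (3 * x3 ** 2 - 1),
    ])

# Vertices of the regular icosahedron, scaled to the unit sphere.
golden = (1 + np.sqrt(5)) / 2
vertices = []
for s1, s2 in itertools.product([-1.0, 1.0], repeat=2):
    vertices += [(0.0, s1, s2 * golden), (s1, s2 * golden, 0.0), (s2 * golden, 0.0, s1)]
design = np.array(vertices) / np.sqrt(1 + golden ** 2)
n = design.shape[0]

Phi = harmonics_up_to_degree_2(design)
gram = Phi.T @ Phi / n  # equals the identity because the cubature rule is exact

f_values = np.exp(design[:, 2])  # hypothetical regression function at the design
coef_cubature = Phi.T @ f_values / n
coef_lstsq, *_ = np.linalg.lstsq(Phi, f_values, rcond=None)

# The fit does not interpolate: 12 points, but only 9 basis functions.
residual = f_values - Phi @ coef_lstsq
```

Since the number of design points exceeds the number of basis functions, the residual at the design points is nonzero in this example, in line with the fact that the least-squares approximation is not an interpolant in general.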
Remark 3.5.
For the case of Sobolev balls, condition (3.7) leads to a stronger assumption on the interplay of the sample size , the resolution level , and the smoothness parameter than necessary. More precisely, we will consider below spherical -designs of cardinality which is the minimal order achievable [DGS77]. Then, global asymptotic equivalence can be achieved under the assumption . From (3.7), only the more restrictive condition can be obtained. In fact, this is a direct consequence of Equation (26) in [Kus00] which implies that
[TABLE]
This lower bound implies that the condition as , which is equivalent to , must be satisfied in order to guarantee that the term converges to zero as desired. The less restrictive condition , established below by finding an appropriate upper bound for the quantity , is also the one expected from the corresponding results in the Euclidean case [Rei08].
Remark 3.6.
Contrary to the approach in the Euclidean case considered in [Rei08], the least-squares approximation can in general not be chosen as an interpolation through the points . Lemma 3 in [Slo95] states that the classical interpolation formula
[TABLE]
holds for all continuous functions on the sphere if and only if the equal-weight cubature rule associated with the design , which is exact for , is also minimal in the sense that equals the dimension of (see [Slo95], p. 242, for this definition of minimality). Now, also from [Slo95], Section 4.1, we report that spherical -designs with this property (such spherical -designs are also referred to as tight) only exist in few special cases. More precisely, in [BD79] it is shown that tight spherical designs do not exist for and . Therefore, the sample size is strictly larger than in general and . Consequently, the term in Theorem 3.1 does not vanish. In the case considered in [Rei08], on the contrary, the term is not present. In that case, a set of equidistant design points (these points form a regular -gon on the unit circle ; see also Example 2.6 in [BB09]) defines a spherical -design for and the least-squares approximation interpolates on the spherical -design. In the multivariate case with regression domain equal to , which is also treated in [Rei08], a product design approach can be chosen which inherits the interpolation property from the case . In this sense, the term is a new ingredient appearing in the bound on the Le Cam distance due to the considered spherical framework of dimension . The term , as mentioned above, is standard. In contrast to the work [Rei08], where fine properties of the Fourier basis are used in order to bound this term, we will exploit recent results from [LW23] on approximation by (weighted) least squares polynomials on the sphere to bound (and also ).
Applying the bound on the Le Cam distance derived in Theorem 3.1 establishes asymptotic equivalence of the experiments and over Sobolev balls.
Theorem 3.7.
Assume that is a spherical -design with . Then, for with ,
[TABLE]
In particular, choosing as a spherical -design with and cardinality of the minimal possible order , yields
[TABLE]
and the experiments and are asymptotically equivalent as .
3.2.2 Asymptotic equivalence over Besov balls
The representation of a regression function, for instance in model (1.1), as a spherical harmonic expansion as in (2.1) suffers from the drawback that spherical harmonics are spread all over the sphere. This leads to poor local performance of regression estimates relying on truncated spherical harmonic expansions. In order to address this issue, the article [NPW06a] introduced a class of localized tight frames on spheres of arbitrary dimension, which are referred to as needlets due to their excellent localization properties. In Figure 3, this essential difference between spherical harmonics and needlets is illustrated by contrasting typical heatmaps of a spherical harmonic and a spherical needlet.
We give a brief sketch of the needlets’ construction since our arguments in the following rely on some fine properties of the needlet expansion of a function. Our presentation here is mainly based on the papers [Bal+09] and [Wan+17].
The first ingredient in the definition of needlets is a continuous and compactly supported function which is referred to as a filter. We assume that satisfies
[TABLE]
For some of the properties stated below, the assumption is more restrictive than necessary. Given (3.9), (3.10) is equivalent to the partition of unity property for ,
[TABLE]
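A simple way to obtain a filter with a partition of unity property is the following telescoping construction, sketched here under assumptions that may differ from the precise conditions imposed in the text. We take a continuous, non-increasing function `phi` that equals 1 on $[0, 1/2]$ and 0 on $[1, \infty)$, and define the filter `b` through $b(t)^2 = \phi(t/2) - \phi(t)$, so that `b` is supported in $[1/2, 2]$. The piecewise linear choice of `phi` and all names are hypothetical; smoother choices are typically used in practice to obtain better localization.

```python
import numpy as np

def phi(t):
    """Continuous and non-increasing: 1 on [0, 1/2], 0 on [1, inf), linear in between."""
    return np.clip(2.0 - 2.0 * np.asarray(t, dtype=float), 0.0, 1.0)

def b(t):
    """Filter with b(t)^2 = phi(t / 2) - phi(t), supported in [1/2, 2]."""
    t = np.asarray(t, dtype=float)
    return np.sqrt(np.clip(phi(t / 2.0) - phi(t), 0.0, None))

# The sum of b(t / 2^j)^2 over j telescopes to phi(t / 2^J) - phi(t),
# which equals 1 for all t >= 1 once 2^J is large compared with t.
t = np.linspace(1.0, 200.0, 5000)
partition = sum(b(t / 2.0 ** j) ** 2 for j in range(12))
```

On the grid above, `partition` is numerically equal to one, which is the discrete analogue of the partition of unity property.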
For a filter function and , we consider the filtered kernel defined by
[TABLE]
where denotes the standard inner product on and the normalized Gegenbauer polynomial defined by
[TABLE]
with the Jacobi polynomial of degree for .
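For the two-dimensional sphere the filtered kernel can be evaluated with a few lines of code. The sketch assumes the normalized surface measure, under which the normalized Gegenbauer polynomial reduces to the Legendre polynomial P_l and the space of spherical harmonics of degree l has dimension 2l + 1; the truncation parameter `max_degree` and all function names are hypothetical.

```python
def legendre(degree, t):
    """Legendre polynomial P_degree(t), computed by the three-term recurrence."""
    if degree == 0:
        return 1.0
    previous, current = 1.0, t
    for l in range(1, degree):
        # (l + 1) P_{l+1}(t) = (2l + 1) t P_l(t) - l P_{l-1}(t)
        previous, current = current, ((2 * l + 1) * t * current - l * previous) / (l + 1)
    return current


def filtered_kernel(filter_fn, scale, t, max_degree):
    """Sum over l = 0..max_degree of filter_fn(l / scale) * (2l + 1) * P_l(t).

    Here t stands for the inner product of the two points on the sphere.
    """
    return sum(
        filter_fn(l / scale) * (2 * l + 1) * legendre(l, t)
        for l in range(max_degree + 1)
    )
```

For a compactly supported filter, `max_degree` only needs to cover the degrees l for which `filter_fn(l / scale)` is non-zero.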
The second ingredient in the construction of spherical needlets consists of cubature rules
[TABLE]
which are exact for functions , that is,
[TABLE]
The weights of these cubature rules are assumed to be positive but not necessarily equal (in principle, however, the underlying cubature rules can themselves be chosen as equal-weight spherical -designs as in Figure 3). Then, for any and , spherical needlets are, for , defined by
[TABLE]
or, equivalently, by
[TABLE]
Of course, this construction of needlet functions depends on the cubature points and their corresponding weights. However, it can be imposed that
[TABLE]
which will be assumed from now on. By construction, the are band-limited but not orthogonal. More precisely, is a polynomial of degree and it holds that
[TABLE]
Moreover, for any ,
[TABLE]
The needlet coefficients of a function are defined as
[TABLE]
and a function can be represented as
[TABLE]
By a slight abuse of notation, we denote the truncated needlet expansion including only the with by ,
[TABLE]
which can equivalently be written as
[TABLE]
where the new filter is defined in terms of the needlet filter as follows:
[TABLE]
Membership in spherical Besov spaces can be characterized by means of the needlet coefficients of a function . More precisely, belongs to the Besov space if and only if
[TABLE]
is referred to as the Besov norm. Putting
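A sequence-space norm of this type is straightforward to evaluate once the needlet coefficients are available. The sketch below assumes the weighting 2^{j(s + d(1/2 - 1/p))} that is common in the needlet literature, with levels indexed from j = 0 and finite p and q; these conventions, like the function name, are assumptions and may differ from the normalization used here.

```python
def besov_sequence_norm(coefficients, s, p, q, d=2):
    """Weighted l_q norm over levels of the l_p norms of needlet coefficients.

    coefficients[j] is the list of needlet coefficients at resolution level j.
    Assumed weight per level: 2 ** (j * (s + d * (1/2 - 1/p))), finite p and q.
    """
    total = 0.0
    for j, level in enumerate(coefficients):
        lp_norm = sum(abs(b) ** p for b in level) ** (1.0 / p)
        weight = 2.0 ** (j * (s + d * (0.5 - 1.0 / p)))
        total += (weight * lp_norm) ** q
    return total ** (1.0 / q)
```

For p = q = 2 the weight reduces to 2^{js}, which matches the intuition that the Sobolev case penalizes level j by the factor 2^{js}.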
[TABLE]
we have the equivalence
[TABLE]
of norms (see [NPW06], Chapter 5). The Sobolev spaces introduced above coincide with the Besov spaces [Bal+09]. We denote by the ball of radius in the Besov space . We now apply the bound on the Le Cam distance derived in Theorem 3.1 to establish asymptotic equivalence of the experiments and over such Besov balls. Denote with the empirical needlet approximation defined by
[TABLE]
where
[TABLE]
for the set of sampling points. In the following, we assume that is a spherical -design for (this assumption is slightly stronger than the one in Theorem 3.1). Of course, is the empirical counterpart of obtained by replacing the -inner product with its empirical analogue relying on the equal-weight cubature rule associated with the spherical design . Equivalently, we can write
[TABLE]
see [Wan+17], Equations (36), (37), and (43). The maximal resolution level will be chosen such that
[TABLE]
which coincides, except for a missing logarithmic term, with the choice of this truncation parameter in adaptive non-parametric estimation using needlets [Bal+09, Mon11].
Theorem 3.8.
Assume that is a spherical -design with , . Then, for with ,
[TABLE]
In particular, choosing as a spherical -design with and cardinality of the minimal possible order , yields
[TABLE]
and the experiments and are asymptotically equivalent as .
4 Regression on random uniform designs
The aim of this section is to establish asymptotic equivalence between the Gaussian white noise model given by (3.2) and random uniform design regression with model equations
[TABLE]
where are i.i.d. but all the other quantities are defined as in the fixed design regression model defined by (3.1). In this section, we restrict ourselves to the smoothness class . The corresponding statistical experiment is denoted by and the following result states the asymptotic equivalence of and under the same smoothness assumption as in the fixed design case.
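A random uniform design can be simulated by normalizing standard Gaussian vectors, which is a standard way to sample from the uniform distribution on a sphere. The following sketch uses only the Python standard library; the function name and the `seed` parameter are illustrative.

```python
import math
import random


def sample_uniform_sphere(n, d, seed=None):
    """Draw n i.i.d. points from the uniform distribution on the unit sphere S^d in R^(d+1).

    Each point is a standard Gaussian vector divided by its Euclidean norm;
    rotation invariance of the Gaussian law makes the result uniform.
    """
    rng = random.Random(seed)
    points = []
    while len(points) < n:
        g = [rng.gauss(0.0, 1.0) for _ in range(d + 1)]
        norm = math.sqrt(sum(c * c for c in g))
        if norm > 0.0:  # guard against the (probability zero) zero vector
            points.append([c / norm for c in g])
    return points
```

For instance, `sample_uniform_sphere(100, 2, seed=0)` returns 100 points on the two-dimensional sphere in three-dimensional space.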
Theorem 4.1.
Let with . Consider the random design regression experiment defined by (4.1). For all sufficiently large , let be maximal such that
[TABLE]
where . Then, for any with , we have
[TABLE]
In particular, choosing and with such that in addition
[TABLE]
implies that
[TABLE]
and the experiments and are asymptotically equivalent as .
The proof of Theorem 4.1, which is given in Section 7.4, is (like the one of Theorem 3.1) constructive in the sense that it provides concrete Markov kernels that allow one to transform observations between the considered experiments. The proof relies on separate calculations for low- and high-frequency coefficients and is similar to the proof of the analogous result in the multivariate Euclidean case given in [Rei08]. However, several additional technical tools (a generalization of the classical QR decomposition, results from representation theory like Schur’s lemma, reverse Hölder inequalities for spherical harmonics, a Taylor series expansion of the matrix-valued function ) are necessary. The interaction of these tools is certainly of independent interest and might be useful to establish further extensions of the present result in future work.
5 Asymptotic non-equivalence
We now show that the asymptotic equivalence results of Sections 3 and 4 are essentially optimal in the sense that the conditions (for Sobolev balls) and (for Besov balls) cannot be weakened further. For this purpose, it turns out to be sufficient to reduce the framework to the case where the common parameter space for both the regression and the Gaussian white noise experiment consists of two elements only. More precisely, we will construct subsets with such that the observations in the regression model become indistinguishable over whereas in the Gaussian white noise model the total variation distance between the two potential distributions is uniformly bounded (as a function of the sample size ) from below by a positive constant. From this, asymptotic non-equivalence between the two experiments follows by a standard argument. The construction of the hypotheses in the two experiments is inspired by the approach in [WW16] where the optimal recovery of smooth functions on spheres from function values was studied (especially, the ideas used in the proof of Lemma 3.5 of that reference turn out to be useful for the proof of the following result).
Theorem 5.1.
Consider with or with . Let be any (deterministic or random) point set with . Denote with the Gaussian white noise experiment given by (3.2) and with the regression experiment defined by (1.1) on the design with regression function . Then,
[TABLE]
In particular, the experiments and are not asymptotically equivalent.
Remark 5.2.
The statement of Theorem 5.1 is consistent with the usual embedding theorems, which state that and can be continuously embedded into the space of continuous functions on the sphere provided that and , respectively (see [NPW06] and [Bal+09] for details). In this light, the requirements and are natural for any method based on function values. The proof of Theorem 5.1 even shows that asymptotic equivalence does not hold if one restricts the function spaces to functions having a finite series expansion (in the proof, we consider for some with ).
6 Discussion
We have proven that non-parametric regression with spherical regressors is asymptotically equivalent, in the sense of Le Cam, to a corresponding Gaussian white noise experiment. The results hold for both the fixed design case, where the sampling points form a spherical -design, and the random uniform design case. As special cases of function classes, for which asymptotic equivalence holds, Sobolev and Besov balls were considered and the smoothness assumptions imposed to establish asymptotic equivalence over these function spaces were shown to be sharp.
The derived results suggest that both spherical -designs and random uniform designs are a good choice of sampling points in non-parametric regression as the resulting statistical experiment is asymptotically equivalent to the Gaussian white noise model which is usually regarded as the simplest model of the form . The established asymptotic equivalences suggest that known results for the Gaussian white noise model on the sphere (for instance, the sharp minimax results from [Kle99] already cited in the introduction) are valid also for both the fixed and the random design regression framework with spherical regressors. Besides estimation procedures as considered in [Kle99], (sharp) non-parametric testing rates and confidence bands might now also be transferable from the idealized Gaussian white noise model to the more realistic regression models.
The interpretation of the considered designs as a good choice of sampling points is in line with existing results on optimal designs in finite-dimensional linear regression models. For truncated expansions in spherical harmonics, the papers [DMP05] (for the two-dimensional sphere) and [Det+19] (for spheres of arbitrary dimension) identify the uniform distribution as an optimal design distribution with respect to the -criteria introduced by Kiefer in [Kie74]. Such a design given by an absolutely continuous distribution cannot be implemented directly in practice. In [Det+19], Remark 3.1, the authors mention spherical -designs as a potential remedy to address this issue. Even more recently, [Hai24] discusses the use of spherical -designs as optimal designs for the special case of the two-dimensional sphere. So-called -designs, which extend the notion of spherical -designs to Riemannian manifolds, are identified as optimal designs for regression on Lie groups in [CDK26].
In this work, we have restricted ourselves to the case of spheres of arbitrary dimension for two reasons: (i) the restriction to this special case keeps the technical jargon to a minimum but already incorporates all the essential ideas, (ii) the case of regression with spherical regressors, especially for the two-dimensional sphere , is certainly the most relevant one in applications. We conjecture that main parts of our analysis can be carried over to non-parametric regression on more general compact Riemannian manifolds. Regression models on manifolds and the presumably equivalent Gaussian white noise model are, for instance, considered in [CKP14] in the context of non-parametric Bayesian estimation. Another work in this direction is [KNP11] where confidence bands for needlet density estimators on compact homogeneous manifolds have been derived.
7 Proofs
7.1 Proof of Theorem 3.1
We first derive a uniform upper bound on , that is, the Le Cam distance between the regression experiment and the intermediate experiment . For this, we first consider the deficiency . Assume that an observation from the regression experiment is given. Setting for , we define the random process
[TABLE]
The random variables are Gaussian with mean zero. Combining the inclusion with the assumption that is a spherical -design for implies that
[TABLE]
By linearity, this shows that the process is standard Gaussian white noise on . By adding Gaussian white noise, scaled by the same factor , on the -orthogonal complement of , we obtain the process
[TABLE]
where denotes a standard Gaussian white noise process. In differential notation, the process can equivalently be written as
[TABLE]
showing that
[TABLE]
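The mechanism behind this step, namely that the empirical inner product induced by a design reproduces the L2 inner product on low-degree polynomials, can be illustrated on the circle. The sketch below is a one-dimensional analogue under stated assumptions: equidistant points play the role of the design, the measure on the circle is normalized, and all function names are hypothetical.

```python
import math


def equidistant_angles(n):
    """Angles of n equidistant design points on the unit circle."""
    return [2.0 * math.pi * k / n for k in range(n)]


def circle_basis(max_degree):
    """Orthonormal basis (normalized measure) of trigonometric polynomials up to max_degree."""
    basis = [lambda a: 1.0]
    for k in range(1, max_degree + 1):
        basis.append(lambda a, k=k: math.sqrt(2.0) * math.cos(k * a))
        basis.append(lambda a, k=k: math.sqrt(2.0) * math.sin(k * a))
    return basis


def empirical_gram(basis, angles):
    """Gram matrix of the basis with respect to the equal-weight empirical inner product."""
    n = len(angles)
    return [[sum(u(a) * v(a) for a in angles) / n for v in basis] for u in basis]


def max_deviation_from_identity(gram):
    """Largest entrywise distance between gram and the identity matrix."""
    size = len(gram)
    return max(
        abs(gram[i][j] - (1.0 if i == j else 0.0))
        for i in range(size)
        for j in range(size)
    )
```

With 9 points and degree at most 4, all products of basis functions have degree at most 8 and are integrated exactly, so the empirical Gram matrix is the identity; raising the degree to 5 destroys this property.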
Conversely, given the continuous observation (3.6) from the intermediate experiment and choosing an orthonormal basis of , the random vector with
[TABLE]
follows a multivariate Gaussian distribution with mean
[TABLE]
and covariance matrix . Consider the design matrix . The vector of fitted values at the sampling points follows a multivariate Gaussian with mean vector
[TABLE]
and covariance matrix . Consider a mean zero multivariate Gaussian random vector with covariance matrix
[TABLE]
independent of (the matrix corresponds to the covariance matrix of the residuals in the linear model given by and is therefore positive semi-definite; the fact that the design points form a spherical -design then implies that ). It follows that the vector follows a multivariate Gaussian with mean (7.2) and covariance matrix . Hence, the deficiency can be bounded by taking the supremum over all of the total variation distance between the multivariate Gaussian vectors and ,
[TABLE]
see, for instance, [NO24]. Combining (7.1) and (7.3) yields
[TABLE]
Next, we derive a uniform upper bound on the Le Cam distance between the intermediate experiment and the Gaussian white noise experiment . Denote by the distribution of the process in the experiment , and by the distribution of the process in the experiment , respectively. Then, (2.5) yields
[TABLE]
from which it follows that
[TABLE]
By combining (7.4) and (7.5) the bound for announced in the theorem follows from the triangle inequality for the Le Cam distance.
7.2 Proof of Theorem 3.7
In the following, we treat the terms and separately under the assumption that is a spherical -design with .
Bound for when
Recall that is defined by (3.8) in the statement of Theorem 3.7. We have
[TABLE]
Putting
[TABLE]
for we can write
[TABLE]
and the series on the right-hand side converges uniformly for when . Using Lemma 7.1 with (note that by definition), we obtain for that
[TABLE]
As a consequence, we obtain by means of the triangle inequality that, for ,
[TABLE]
It follows that for , the term in Theorem 3.1 can be bounded as
[TABLE]
Bound for when
Under the assumptions of Theorem 3.1, applying [LW23], Theorem 1.2, Equation (1.6), yields directly that, for with ,
[TABLE]
Combining the bounds (7.6) and (7.7) implies the statement of Theorem 3.7.
7.3 Proof of Theorem 3.8
As for the Sobolev case, the proof of Theorem 3.8 relies on finding suitable upper bounds for the quantities and in the proof of Theorem 3.1. In the following analysis, we assume that . The result of the theorem equally holds true for the case and directly follows from the case via the Besov embedding for (see [Bal+09], Theorem 5).
Bound for when
With taking the role of in Theorem 3.1, that is, setting , we obtain
[TABLE]
To bound uniformly for , we use the decomposition
[TABLE]
where we have used that is a spherical -design with . Below, in the analysis of , we will show that
[TABLE]
From [Bal+09], p. 3383, we obtain
[TABLE]
Finally, by Lemma 7.1 and again the estimate from [Bal+09], p. 3383, we obtain that
[TABLE]
Combining (7.8), (7.9), and (7.10), we obtain
[TABLE]
Bound for when
We now derive a bound for . Using Equation (3.15), we obtain
[TABLE]
where we use both the fact that the equal-weight cubature rule associated with the spherical -design is exact for polynomials of degree and identity (3.12). It follows that
[TABLE]
From [Bal+09], we immediately obtain that the first term on the right-hand side can be bounded as
[TABLE]
Let us now consider the second term. [Wan+17], Theorem 3.3, yields that for the inequality
[TABLE]
holds with a constant that depends neither on nor on . Using this inequality yields
[TABLE]
where we use Lemma 7.1 twice (in each case with but first with and then with ). Hence,
[TABLE]
Combining the bounds (7.12) and (7.13), we obtain, uniformly for ,
[TABLE]
which implies that
[TABLE]
Combining the bounds (7.11) and (7.14) finishes the proof of Theorem 3.8.
7.4 Proof of Theorem 4.1
As in the fixed design case we denote with the empirical inner product defined by
[TABLE]
and write for the associated norm. denotes the orthogonal projection on with respect to .
We first notice that the random design regression experiment given by Equation (4.1) is asymptotically equivalent to the experiment with (4.1) replaced by
[TABLE]
Indeed, by conditioning on the design , we have for and the bound
[TABLE]
Using the conditioning property for -divergences (see [PW24], p. 117) and Jensen’s inequality, we obtain, uniformly for ,
[TABLE]
Hence,
[TABLE]
Analogously, we bound the Le Cam distance between the Gaussian white noise experiment and the truncated experiment given by
[TABLE]
In this case, a direct application of inequality (2.5) yields that
[TABLE]
After this reduction to two experiments with truncated parameter, it remains to find a bound for the Le Cam distance , that is, we can assume without loss of generality that for the rest of the proof without further reference.
In order to obtain the bound for we consider three further intermediate experiments (denoted by , , and below). To state these experiments, we first introduce some notation. Define the event
[TABLE]
Similar to the proof of Theorem 3.1, we denote with the design matrix where for the enumeration function introduced in Section 2.1 and are spherical harmonics up to resolution level , that is, and are an -orthonormal basis of . If the matrix has full column rank (this is always the case on ), we apply the generalized thin decomposition (described in detail in Section 7.6.3) to obtain a matrix with orthonormal columns and an upper triangular block matrix such that
[TABLE]
Here, the choice of the blocks is as in Proposition 7.3, that is, the diagonal blocks are in one-to-one correspondence with the eigenspaces , . We define functions by
[TABLE]
The fact that is a block lower triangular matrix implies that for . Consider the mapping defined by for . Since the matrix in the generalized thin decomposition has orthonormal columns, are orthonormal with respect to the empirical scalar product and is an isometry. More precisely, we have
[TABLE]
where is the entry of in the -th row and -th column.
Set for the intermediate resolution level . Departing from observations in the random uniform design regression experiment with regression function , we define, given the design , a first intermediate continuous experiment with observation given by
[TABLE]
where is standard Gaussian white noise on the complement of . Note that the observations in the experiment and in the experiment can be transferred into one another provided that has full column rank. Obviously, (7.16) defines in terms of under this assumption. Vice versa, observations following the same distribution as can be generated from (7.16) by transforming the high-frequency part by application of the mapping (that is, is replaced with ) and then using the same argument as in the proof of Theorem 3.1. Defining Markov kernels between the underlying measurable spaces arbitrarily on the null set where does not have full column rank then implies .
In addition to , we define two further intermediate experiments and , that are both defined, conditional on the design , by observations and defined as follows:
[TABLE]
Here, both and denote standard Gaussian white noise on . In the following, we will bound the Le Cam distance by means of the triangle inequality,
[TABLE]
In order to bound these three terms, it is sufficient to work on the event , since by (7.26) we have the bound
[TABLE]
(of course, the terms and can be treated analogously). It remains to find appropriate bounds for the terms
[TABLE]
and
[TABLE]
which will finish the proof. Before we consider these three terms separately, we recall that the transformations used to define , , and are only well-defined if has full column rank. As already mentioned above, this condition holds true on the event . If does not have full column rank, one can define the Markov kernels that transform between the considered experiments arbitrarily.
Bound on
Following the notation introduced in Section 7.6.3, we denote with the submatrix of consisting only of the first columns. Note that can be written as
[TABLE]
where the mean zero vector is defined by
[TABLE]
for and has covariance matrix
[TABLE]
The processes and have the same mean but different covariance structure, and applying [DMR18], Theorem 1.1, combined with Equation (2) from the same reference, we obtain that
[TABLE]
where denotes the Frobenius norm of a matrix. Taking expectations and using the addition formula (2.3) we obtain that
[TABLE]
Bound on
Note that on the event we have that
[TABLE]
By combining this bound with Equation (2.5), we have
[TABLE]
where . Using the addition formula (2.3), we obtain
[TABLE]
from which we conclude that
[TABLE]
Bound on
We have
[TABLE]
By (7.15), the matrix corresponding to the mapping (in terms of the spherical harmonics ) is the block upper triangular matrix which implies that
[TABLE]
Hence,
[TABLE]
Combining the identity
[TABLE]
for , , with Proposition 7.3 yields that
[TABLE]
We decompose
[TABLE]
The Cauchy-Schwarz inequality yields
[TABLE]
Combining the Cauchy-Schwarz inequality, the estimate , and the fact that is an isometry yields that
[TABLE]
Hence, by combining the last estimate with Equation (7.26), we get
[TABLE]
Identifying the projection with the coefficient vector
[TABLE]
we can write
[TABLE]
where the matrix is defined as
[TABLE]
On we have
[TABLE]
which implies that the spectrum of is contained in the interval . Consequently, the spectrum of is contained in on . Using the Taylor series expansion of the matrix function (which converges for with spectrum contained in ; see [Hig08], Theorem 4.7), we obtain that
[TABLE]
By taking expectations we obtain
[TABLE]
First, since , we obtain
[TABLE]
Second, by (7.26), Proposition 7.4, and Proposition 7.5, (b),
[TABLE]
where, according to Proposition 7.4, we can take
[TABLE]
Hence, for ,
[TABLE]
Third, the bound
[TABLE]
yields
[TABLE]
from which we obtain that
[TABLE]
We have
[TABLE]
Note that the non-zero eigenvalues of coincide with those of and that on the event the positive eigenvalues of are bounded by . It follows that
[TABLE]
By Proposition 7.5, (a), we thus obtain
[TABLE]
By combining all the obtained estimates, we have
[TABLE]
Thus
[TABLE]
finishing the proof of the theorem.
7.5 Proof of Theorem 5.1
For the experiments and , we reduce the common parameter set to for suitably defined . For any and both or , we take (which trivially belongs to both of the smoothness classes considered). In order to define , we first choose an integer such that for some suitable constant . Consider the linear subspace of defined by
[TABLE]
We have
[TABLE]
Now, by [DW13], Proposition 3.5, there exists a function such that
[TABLE]
for all . Based on these preliminaries we now construct such that
[TABLE]
Case
We make the ansatz
[TABLE]
with a constant independent of the value of which will be specified now. Since , we have by the choice of combined with (7.17) that
[TABLE]
showing that for sufficiently small. Moreover,
[TABLE]
Case with
As in the previous case, we put
[TABLE]
with a constant independent of to be chosen appropriately. Let be such that . Note that for any and for . Then, using the equivalence (3.14) of norms, we obtain
[TABLE]
Hence provided that is chosen sufficiently small. Moreover, as in the previous case,
[TABLE]
Case with
Let be such that , and put
[TABLE]
with as defined by (3.11) and (3.13). As in the previous cases, the constant has to be chosen sufficiently small and independent of . On the one hand
[TABLE]
on the other hand
[TABLE]
and hence . We have
[TABLE]
By [NPW06], Proposition 2.5, we obtain that, for ,
[TABLE]
where we used (7.17) and [Wan+17], Theorem 3.3, in the last estimate. Using (7.19) and [NPW06], Proposition 2.5, again, we have
[TABLE]
Combining the last estimate with (7.20), we obtain
[TABLE]
also for . Using (7.21), we first obtain with being chosen as in the previous case that
[TABLE]
hence for sufficiently small. Moreover, also from (7.21) we directly obtain
[TABLE]
which is (7.18) for and .
Having established (7.18) in all three cases of interest, we can now derive (5.1). By the very definition of we have , which trivially implies that
[TABLE]
for any . Combining (7.18) with (2.5), however, shows that
[TABLE]
for all the considered cases. [RSH19], Lemma 1, states that
[TABLE]
Plugging (7.22) and (7.23) into (7.24) implies that and hence the asymptotic non-equivalence of and .
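To see why two hypotheses remain distinguishable in the Gaussian white noise model, one can use the standard closed form for the total variation distance between two Gaussian shift models with the same noise level, 2Φ(δ/(2ε)) - 1, where δ is the L2 distance of the drifts and ε the noise level. The sketch below evaluates this textbook formula; it is not claimed to be the form of the bounds used in the proof, and the function names are illustrative.

```python
import math


def standard_normal_cdf(x):
    """Distribution function of the standard normal law."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def tv_gaussian_shift(l2_distance, noise_level):
    """Total variation distance between two Gaussian white noise models.

    Both models have the same noise_level; their drift functions are
    l2_distance apart in L2. The value is 2 * Phi(l2_distance / (2 * noise_level)) - 1.
    """
    return 2.0 * standard_normal_cdf(l2_distance / (2.0 * noise_level)) - 1.0
```

If the L2 distance of the two hypotheses is of the same order as the noise level, the ratio stays bounded away from zero and so does the total variation distance.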
7.6 Auxiliary results
7.6.1 A consequence of the Marcinkiewicz-Zygmund condition
Our asymptotic equivalence results for specific function classes stated in Sections 3.2.1 and 3.2.2 rely on recent contributions concerning numerical integration on spheres, for instance, from [LW23]. The main assumption in [LW23] is the validity of a certain Marcinkiewicz-Zygmund condition on the cubature points (see [LW23], Equation (1.3)). The following technical lemma, which is used in the proofs of Theorems 3.7 and 3.8, is essentially based on [Dai06], Theorem 2.1. Our formulation is slightly closer to the version stated in [LW23], Lemma 3.1. The condition (7.25) in the lemma is always satisfied for spherical -designs when taking , , (in fact, in this case even equality holds in (7.25)). The assertion of Lemma 7.1 then allows one to bound the value of the cubature formula (which is exact on ) from above by the target integral also for truncated spherical harmonics expansions where the cubature is no longer exact (the more coefficients are involved, the less accurate the bound becomes). This kind of bound is used in the proofs of Theorems 3.7 and 3.8, respectively, to control the empirical norms appearing in the term of Theorem 3.1.
Lemma 7.1.
Let and be weights satisfying, for some and some ,
[TABLE]
where is a numerical constant. Then, for any and any integer ,
[TABLE]
with a numerical constant depending only on and .
7.6.2 Bound for
The defining property of the event ,
[TABLE]
is equivalent to
[TABLE]
where denotes the design matrix defined in the Proof of Theorem 4.1. Applying Theorem 1 from [CDL13] (taking into account the corrected numerical constants from [CDL18]) with (leading to the choice in the statement of that theorem) and choosing the maximal such that for
[TABLE]
we obtain the estimate
[TABLE]
In Theorem 4.1 and its proof, we choose which leads to .
7.6.3 Generalized thin decomposition
For the proof of Theorem 4.1 we need a generalization of the classical thin QR decomposition of a rectangular matrix with full column rank (see [GVL13], Theorem 5.2.3). Since we are not aware of a reference for this kind of generalization, we state it here in full detail (a proof can be obtained by an obvious adjustment of the one in the classical case). Recall that a square matrix is called positive definite if the following two conditions hold:
- (i) is symmetric, that is, ,
- (ii) for all with .
Here and in the proof of Theorem 4.1 (given in Section 7.4), we will use underlined indices like in order to refer to the indices belonging to the block with index . For instance, in the following theorem is a shorthand for the indices running from to .
Theorem 7.2 (Generalized thin decomposition).
Suppose has full column rank. Let such that . Then, the generalized thin decomposition
[TABLE]
is unique where has orthonormal columns and with satisfies the following properties:
- (i) if and
- (ii) is positive definite.
Proof.
The existence part of Theorem 7.2 can be proven in a constructive way, for instance by adapting the classical Gram-Schmidt algorithm (see [GVL13], Section 5.2.7) in a suitable manner. The resulting algorithm is stated in Algorithm 1. For a positive definite matrix , we call a decomposition
[TABLE]
where with blocks a (generalized) Cholesky decomposition if the following properties are satisfied:
- (i) if ,
- (ii) is positive definite.
As for the usual Cholesky decomposition (which corresponds to the case and ) the generalized Cholesky decomposition is uniquely determined given and and the factor is referred to as the Cholesky factor. Note that for the factor in the generalized thin decomposition is the Cholesky factor in the generalized Cholesky factorization of the symmetric positive definite matrix (of course, for the same choice of and ). The uniqueness of the generalized thin decomposition follows from the uniqueness of the Cholesky factor combined with . ∎
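For orientation, the classical case with blocks of size one can be sketched in a few lines. The code below computes a thin QR factorization by the modified Gram-Schmidt procedure (a numerically more stable variant of the classical Gram-Schmidt algorithm referred to in the proof); it does not implement the generalized version with positive definite diagonal blocks, and the function name and data layout are hypothetical.

```python
import math


def thin_qr(columns):
    """Thin QR factorization by modified Gram-Schmidt (classical case, 1x1 blocks).

    columns: list of p linearly independent column vectors, each a list of n floats.
    Returns (q_columns, r): p orthonormal columns and a p x p upper triangular
    matrix r with positive diagonal such that column j equals sum_i r[i][j] * q_columns[i].
    """
    p = len(columns)
    q_columns = []
    r = [[0.0] * p for _ in range(p)]
    for j in range(p):
        v = list(columns[j])
        # Remove the components along the previously constructed orthonormal columns.
        for i, q in enumerate(q_columns):
            r[i][j] = sum(qc * vc for qc, vc in zip(q, v))
            v = [vc - r[i][j] * qc for qc, vc in zip(q, v)]
        r[j][j] = math.sqrt(sum(vc * vc for vc in v))
        if r[j][j] == 0.0:
            raise ValueError("columns are linearly dependent")
        q_columns.append([vc / r[j][j] for vc in v])
    return q_columns, r
```

In this classical case the transpose of `r` multiplied by `r` reproduces the Gram matrix of the input columns, which is the Cholesky relation used in the uniqueness argument.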
In the following proposition we state key stochastic properties of the block upper triangular matrix in the generalized thin decomposition (7.27).
Proposition 7.3.
Let be defined as the block upper triangular matrix with positive definite diagonal blocks in the generalized thin decomposition of the design matrix , where are all the spherical harmonics up to the resolution level (consequently, ), the numbers in the generalized thin decomposition are chosen as , and are i.i.d. . Then,
- (a) , ,
- (b) , with .
Proof.
Let be a random rotation distributed according to the Haar measure on and set (here, should be interpreted as a -matrix and as an element of ). Then, are i.i.d. . Denote with the corresponding design matrix. Note that by [Gro96], Proposition 3.2.4, can be written as
[TABLE]
where is a block matrix with if , and is orthogonal. Let be the generalized thin decomposition of which is related to the decomposition of via
[TABLE]
that is, . Since and have the same law, we have
[TABLE]
Combining Schur’s lemma (see [Ser77], Proposition 4 and Corollary 1, which carry over to the compact group case as explained in Section 4.3 of that reference) with the fact that the spaces , , are in one-to-one correspondence with the irreducible representations of (see [DX13], Theorem 1.7.2) then implies both assertions (a) and (b). ∎
7.6.4 A reverse Hölder inequality for spherical harmonics
The following result, which provides even (slightly) stronger statements than what we need in the proof of Theorem 4.1, gathers some special instances of reverse Hölder inequalities for spherical harmonics from [DFT16].
Proposition 7.4.
Let and . Then,
[TABLE]
Proof.
Note that . From this, the claimed assertions follow from Theorem 1.1 in [DFT16]: The cases and follow from assertions (iv) and (i) of that theorem, respectively. The case , however, follows from assertion (ii). ∎
7.6.5 Further estimates
Part (a) of the following proposition is an adaptation of [Rei08], Proposition 4.9, to our setup.
Proposition 7.5.
Let . Then, the following assertions hold true:
- (a) ,
- (b) .
Proof.
We have
[TABLE]
which proves (a). For the proof of (b), we use the decomposition
[TABLE]
For the first term on the right-hand side, we have by (a) and the defining property of the event that
[TABLE]
The second term on the right-hand side is bounded as
[TABLE]
Combining the two obtained bounds implies (b). ∎
References
- [ANS96] Peter Alfeld, Marian Neamtu and Larry L. Schumaker “Fitting scattered data on sphere-like surfaces using spherical splines” In J. Comput. Appl. Math. 73.1–2 Elsevier BV, 1996, pp. 5–43 DOI: 10.1016/0377-0427(96)00034-9
- [Ave93] John Avery “Selected applications of hyperspherical harmonics in quantum theory” In J. Phys. Chem. 97.10 American Chemical Society (ACS), 1993, pp. 2406–2412 DOI: 10.1021/j100112a048
- [Bal+09] P. Baldi, G. Kerkyacharian, D. Marinucci and D. Picard “Adaptive density estimation for directional data using needlets” In Ann. Statist. 37.6A Institute of Mathematical Statistics, 2009 DOI: 10.1214/09-aos682
- [BB09] Eiichi Bannai and Etsuko Bannai “A survey on spherical designs and algebraic combinatorics on spheres” In Eur. J. Combin. 30.6 Elsevier BV, 2009, pp. 1392–1425 DOI: 10.1016/j.ejc.2008.11.007
- [BD79] E. Bannai and R. M. Damerell “Tight spherical designs. I” In J. Math. Soc. Jpn. 31.1, 1979, pp. 199–207 DOI: 10.2969/jmsj/03110199
- [BG15] Johann S. Brauchart and Peter J. Grabner “Distributing many points on spheres: Minimal energy and designs” In J. Complexity 31.3, 2015, pp. 293–326 DOI: 10.1016/j.jco.2015.02.003
- [BL96] Lawrence D. Brown and Mark G. Low “Asymptotic equivalence of nonparametric regression and white noise” In Ann. Statist. 24.6, 1996, pp. 2384–2398 DOI: 10.1214/aos/1032181159
- [Bro+02] Lawrence D. Brown, T. Tony Cai, Mark G. Low and Cun-Hui Zhang “Asymptotic equivalence theory for nonparametric regression with random design” Dedicated to the memory of Lucien Le Cam In Ann. Statist. 30.3, 2002, pp. 688–707 DOI: 10.1214/aos/1028674838
