Explicit error bounds for randomized Smolyak algorithms and an application to infinite-dimensional integration
Michael Gnewuch, Marcin Wnuk

TL;DR
This paper analyzes randomized Smolyak algorithms, providing explicit error bounds and demonstrating their effectiveness in high-dimensional and infinite-dimensional integration problems, with applications to weighted reproducing kernel Hilbert spaces.
Contribution
It introduces explicit error bounds for randomized Smolyak algorithms and applies these results to infinite-dimensional integration, highlighting when randomized methods outperform deterministic ones.
Findings
Derived upper and lower error bounds with explicit dependence on variables and evaluations.
Established convergence rates for N-th minimal errors in infinite-dimensional integration.
Characterized spaces where randomized algorithms outperform deterministic counterparts.
Michael Gnewuch Institut für Mathematik, Universität Osnabrück, Germany ([email protected])
Marcin Wnuk Mathematisches Seminar, Christian-Albrechts-Universität zu Kiel, Germany ([email protected]).
Abstract
Smolyak’s method, also known as hyperbolic cross approximation or sparse grid method, is a powerful tool to tackle multivariate tensor product problems solely with the help of efficient algorithms for the corresponding univariate problem.
In this paper we study the randomized setting, i.e., we randomize Smolyak’s method. We provide upper and lower error bounds for randomized Smolyak algorithms with explicitly given dependence on the number of variables and the number of information evaluations used. The error criteria we consider are the worst-case root mean square error (the typical error criterion for randomized algorithms, often referred to as “randomized error”) and the root mean square worst-case error (often referred to as “worst-case error”).
Randomized Smolyak algorithms can be used as building blocks for efficient methods such as multilevel algorithms, multivariate decomposition methods or dimension-wise quadrature methods to tackle successfully high-dimensional or even infinite-dimensional problems. As an example, we provide a very general and sharp result on the convergence rate of $N$-th minimal errors of infinite-dimensional integration on weighted reproducing kernel Hilbert spaces. Moreover, we are able to characterize the spaces for which randomized algorithms for infinite-dimensional integration are superior to deterministic ones. We illustrate our findings for the special instance of weighted Korobov spaces. We indicate how these results can be extended, e.g., to spaces of functions whose smooth dependence on successive variables increases (“spaces of increasing smoothness”) and to the problem of $L^2$-approximation (function recovery).
1 Introduction
Smolyak’s method or algorithm, also known as sparse grid method, hyperbolic cross approximation, Boolean method, combination technique or discrete blending method, was outlined by Smolyak in [56]. It is a general method to treat multivariate tensor product problems. Its major advantage is the following: to tackle a multivariate tensor product problem at hand one only has to understand the corresponding univariate problem. More precisely, Smolyak’s algorithm uses algorithms for the corresponding univariate problem as building blocks, and it is fully determined by the choice of those algorithms. If those algorithms for the univariate problem are optimal, then typically Smolyak’s algorithm for the multivariate problem is almost optimal, i.e., its convergence rate is optimal up to logarithmic factors.
Today Smolyak’s method is widely used in scientific computing and there exists a huge number of scientific articles dealing with applications and modifications of it. A partial list of papers (which is, of course, very far from being complete) on deterministic Smolyak algorithms may contain, e.g., the articles [64, 65] for general approximation problems, [16, 6, 58, 3, 13, 46, 17, 18, 51, 26, 31] for numerical integration, [28, 5, 57, 59, 53, 62, 11] for function recovery, and [50, 70, 45, 68, 69, 14, 15] for other applications. Additional references and further information may be found in the survey articles [4, 29], the book chapters [47, Chapter 15], [60, Chapter 4], and the books [7, 12].
On randomized Smolyak algorithms much less is known. Actually, we are only aware of two articles that deal with randomized versions of Smolyak’s method, namely [10] and [34]. In [10] Dick et al. investigate a specific instance of the randomized Smolyak method and use it as a tool to show that higher order nets may be used to construct integration algorithms achieving almost optimal order of convergence (up to logarithmic factors) of the worst case error in certain Sobolev spaces. In [34] Heinrich and Milla employ the randomized Smolyak method as a building block of an algorithm to compute antiderivatives of functions, allowing for fast computation of antiderivative values at any point of the domain. Note that in both cases the randomized Smolyak method is applied as an ad hoc device and none of the papers gives a systematic treatment of its properties.
With this paper we want to start a systematic treatment of randomized Smolyak algorithms. Similar to the paper [64], where deterministic Smolyak methods were studied, we discuss the randomized Smolyak method for general linear approximation problems on tensor products of Hilbert spaces. Examples of such approximation problems are numerical integration or $L^2$-approximation, i.e., function recovery.
The error criteria for randomized algorithms or, more generally, randomized operators that we consider are extensions of the worst-case error for deterministic algorithms. The first error criterion is the worst-case root mean square error, often referred to as “randomized error”. This error criterion is typically used to assess the quality of randomized algorithms. The second error criterion is the root mean square worst-case error, often referred to as “worst-case error”. This quantity is commonly used to prove the existence of a good deterministic algorithm with the help of the “pigeonhole principle”: It arises as an average of the usual deterministic worst-case error over a set of deterministic algorithms endowed with a probability measure. If the average is small, there exists at least one algorithm in this set with small worst-case error, see, e.g., [10] or [55]. Notice that the pair consisting of the set and the probability measure can be canonically identified with a randomized algorithm.
We derive upper error bounds for both error criteria for randomized Smolyak algorithms with explicitly given dependence on the number of variables and the number of information evaluations used. The former number is the underlying dimension of the problem, the latter number is typically proportional to the cost of the algorithm. The upper error bounds show that the randomized Smolyak method can be efficiently used at least in moderately high dimension. We complement this result by providing lower error bounds for randomized Smolyak algorithms that nearly match our upper bounds.
As in the deterministic case, our upper and lower error bounds contain logarithmic factors whose powers depend linearly on the underlying dimension $d$, indicating that the direct use of the randomized Smolyak method in very high dimension may be prohibitive. Nevertheless, our upper error bounds show that randomized Smolyak algorithms make perfect building blocks for more sophisticated algorithms such as multilevel algorithms (see, e.g., [33, 35, 19, 20, 21, 36, 43, 22, 2, 8, 39]), multivariate decomposition methods (see, e.g., [40, 52, 63, 23, 8, 9]) or dimension-wise quadrature methods (see [30]). We demonstrate this in the case of the infinite-dimensional integration problem on weighted tensor products of reproducing kernel Hilbert spaces with general kernels. We provide the exact polynomial convergence rate of $N$-th minimal errors; the corresponding upper error bound is established by multivariate decomposition methods based on randomized Smolyak algorithms.
The paper is organized as follows: In Section 2 we provide the general multivariate problem formulation and illustrate it with two examples. In Section 3 we introduce the randomized multivariate Smolyak method building on randomized univariate algorithms. Our assumptions about the univariate randomized algorithms resemble the ones made in [64] in the deterministic case. In Remark 2 we observe that we may identify our randomized linear approximation problem of interest with a corresponding deterministic $L^2$-approximation problem.
In Section 4 we follow the course of [64] and establish first error bounds in terms of the underlying dimension of the problem and the level of the considered Smolyak algorithm, see Theorem 3 and Corollary 4. For the randomized error criterion Remark 2 helps us to boil the error analysis of the randomized Smolyak method down to the error analysis of the deterministic Smolyak method provided in [64]. For the worst case error criterion Remark 2 is of no help and therefore we state the details of the analysis.
Up to this point we consider general randomized operators to approximate the solution we are seeking. In Section 5 we focus on randomized algorithms and the information evaluations they use. In Theorem 7 we present upper error bounds for the randomized Smolyak method where the dependence on the underlying dimension of the problem and on the number of information evaluations is revealed. In Corollary 14 we provide lower error bounds for the randomized Smolyak method.
In Section 6 we apply our upper error bounds for randomized Smolyak algorithms to the infinite-dimensional integration problem. After introducing the setting, we provide the exact polynomial convergence rate of $N$-th minimal errors in Theorem 18. In Corollary 19 we compare the power of randomized algorithms and deterministic algorithms for infinite-dimensional integration, and in Corollary 20 we illustrate the result of Theorem 18 for weighted Korobov spaces. In Remark 21 and Remark 22 we discuss previous contributions to the considered infinite-dimensional integration problem and generalizations to other settings such as to function spaces with increasing smoothness or to the $L^2$-approximation problem.
In the appendix we provide for the convenience of the reader a self-contained proof of a folklore result on the convergence rates of randomized algorithms on Korobov spaces.
2 Formulation of the Problem
Let . For let be a separable Hilbert space of real valued functions, be a separable Hilbert space, and be a continuous linear operator. We consider now the tensor product spaces and given by
[TABLE]
[TABLE]
and the tensor product operator (also called solution operator) given by
[TABLE]
We frequently use results concerning tensor products of Hilbert spaces and tensor product operators without giving explicit reference, for details on this subject see, e.g., [67]. We denote the norms in and by and respectively, and the norms in and simply by . Furthermore, denotes the space of all bounded linear operators between and
may be approximated by randomized linear algorithms or, more generally, by randomized linear operators. We define a randomized linear operator to be a mapping
[TABLE]
such that is a random variable for each ; here is some probability space and is endowed with its Borel field. We put
[TABLE]
Obviously one may interpret deterministic bounded linear operators as randomized linear operators with trivial dependence on . Accordingly, we put
[TABLE]
where the inclusion is based on the identification of with the constant mapping .
A *(randomized) linear approximation problem* is given by a quadruple where denotes the class of admissible randomized linear operators. We are mainly interested in results for randomized linear algorithms, which constitute a subclass of and will be introduced in Section 5.
Consider a randomized linear operator meant to approximate . The randomized error of the operator is given by
[TABLE]
and the *(root mean square) worst case error* is
[TABLE]
Clearly we have .
Notice that for a deterministic linear operator both errors coincide with the deterministic worst case error
[TABLE]
i.e., .
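The difference between the two error criteria can be made concrete in a toy setting. The following is a minimal sketch, not taken from the paper: it assumes finite-dimensional spaces, a diagonal solution operator, and a randomized operator with finitely many diagonal outcomes. The function names and the outcome format are hypothetical. For diagonal operators the supremum over the unit ball reduces to a maximum over coordinates, so the worst-case root mean square error and the root mean square worst-case error can both be evaluated exactly.

```python
import math

def randomized_error(s_diag, outcomes):
    """Worst-case root mean square error for diagonal operators.

    s_diag   : diagonal of the solution operator S (illustrative assumption)
    outcomes : list of (probability, diagonal of A(omega)) pairs

    For diagonal D = S - A(omega), sup_{||f|| <= 1} E ||D f||^2 equals
    the largest coordinate-wise second moment max_i E[d_i^2].
    """
    second_moments = [
        sum(p * (s - a[i]) ** 2 for p, a in outcomes)
        for i, s in enumerate(s_diag)
    ]
    return math.sqrt(max(second_moments))

def worst_case_error(s_diag, outcomes):
    """Root mean square worst-case error for diagonal operators.

    The operator norm of a diagonal matrix is its largest absolute
    entry, so we average the squared norm over the outcomes.
    """
    mean_sq_norm = sum(
        p * max((s - ai) ** 2 for s, ai in zip(s_diag, a))
        for p, a in outcomes
    )
    return math.sqrt(mean_sq_norm)

# S is the identity on R^2; A(omega) recovers one random coordinate exactly.
S = (1.0, 1.0)
A = [(0.5, (1.0, 0.0)), (0.5, (0.0, 1.0))]
e_r = randomized_error(S, A)   # sqrt(1/2): every f is hit only half the time
e_w = worst_case_error(S, A)   # 1: each single realization has norm-one error
```

In this toy case the randomized error is strictly smaller than the root mean square worst-case error, and for a deterministic operator (a single outcome with probability one) the two functions return the same value.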
We finish this section by giving two examples of typical tensor product problems that fit into the framework given above.
Example 1.
For let be an arbitrary domain and let be a positive measure on . Denote by the Cartesian product and by the product measure on .
- (i) By choosing , , and to be the integration functional , we obtain , , and is the integration functional on given by
[TABLE]
The integration problem is now to compute or approximate for given the integral .
- (ii) By choosing and to be the embedding operator from into , we obtain and is the embedding operator from into given by
[TABLE]
The -approximation problem is now to reconstruct a given function , i.e., to compute or approximate ; the reconstruction error is measured in the -norm.
Note that in both problem formulations above the phrase “a given function” does not necessarily mean that the whole function is known. Usually only partial information about the function is available (like a finite number of values of the function or of its derivatives or a finite number of Fourier coefficients). We discuss this point in more detail in Section 5.1.
3 Smolyak Method for Tensor Product Problems
From now on we are interested in randomizing the Smolyak method which is to be defined in this section. Assume that for every we have a sequence of randomized linear operators which approximate the solution operator such that for every it holds: is a random variable on a probability space. We shall refer to the individual operators as building blocks.
Put . We denote
[TABLE]
and
[TABLE]
Note that if , then . For the randomized Smolyak method of level L approximating the tensor product problem is given by
[TABLE]
We would like to stress that due to the definition of the probability space for given the families are mutually independent. Note that for the Smolyak method is the zero operator. Therefore, we will always assume (without stating it explicitly every time) that
It can be verified that the following representation holds
[TABLE]
cf. [64, Lemma 1]. When investigating the randomized error we need that for every and
[TABLE]
In the worst case error analysis we require for every and
[TABLE]
and that is measurable with
[TABLE]
Let When considering the error we assume that there exist constants and such that for every and every
[TABLE]
[TABLE]
and additionally in the randomized setting
[TABLE]
and in the worst case setting
[TABLE]
Note that (9) implies the conditions (10) and (11) with a constant for all Still (10) and (11) may even hold for some smaller
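The construction of this section can be illustrated for the integration problem on the unit cube. The sketch below is an assumption-laden illustration, not the authors' implementation: it uses randomly shifted equispaced rules with 2^l nodes as univariate building blocks (a stand-in chosen for simplicity), takes differences of consecutive levels with the level-zero operator equal to zero, and sums the tensor products of these differences over all multi-indices with positive integer entries whose sum is at most L. The shifts are drawn independently for every coordinate and, as an additional assumption, for every level. All function names are hypothetical.

```python
import itertools
import math
import random

def shifted_rule(level, rng):
    """Randomly shifted equispaced rule on [0, 1) with 2**level nodes.

    Returns a list of (node, weight) pairs; one uniform shift per call.
    """
    n = 2 ** level
    shift = rng.random()
    return [((k + shift) / n, 1.0 / n) for k in range(n)]

def difference(level, rules):
    """Nodes and signed weights of U_level - U_(level-1), with U_0 = 0."""
    nodes = list(rules[level])
    if level > 1:
        nodes += [(x, -w) for x, w in rules[level - 1]]
    return nodes

def smolyak_integrate(f, d, L, rng):
    """Randomized Smolyak quadrature of level L for f on [0, 1)^d."""
    if L < d:
        return 0.0  # empty index set: the method is the zero operator
    max_level = L - d + 1
    # Independent randomization for each coordinate (and each level).
    rules = [
        {l: shifted_rule(l, rng) for l in range(1, max_level + 1)}
        for _ in range(d)
    ]
    total = 0.0
    for levels in itertools.product(range(1, max_level + 1), repeat=d):
        if sum(levels) > L:
            continue
        deltas = [difference(l, rules[n]) for n, l in enumerate(levels)]
        # Tensor product of the univariate difference operators.
        for combo in itertools.product(*deltas):
            weight = math.prod(w for _, w in combo)
            total += weight * f([x for x, _ in combo])
    return total

rng = random.Random(0)
estimate = smolyak_integrate(lambda x: math.exp(-sum(x)), d=3, L=6, rng=rng)
```

Since the weights of every building block add up to one, constants are integrated exactly, which gives a quick sanity check of the index set and the sign pattern of the differences.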
Remark 2.
For our randomized error analysis it would be convenient to identify a randomized linear operator separable Hilbert spaces, with the mapping (12) which we again denote by
[TABLE]
We now show that this identification makes sense for all the operators we are considering. We start with the building blocks From (10) we obtain
[TABLE]
implying for all The building blocks are obviously linear as mappings and, due to (13), also bounded, i.e. continuous. Now, since for arbitrary sample spaces and separable Hilbert spaces it holds
[TABLE]
we have that lies in for Clearly, the tensor product operator is a bounded linear mapping . Since due to (4) the Smolyak method may be represented as a finite sum of such tensor product operators, it is also a bounded linear map
If we formally consider $S_d$ as an operator $f \mapsto (\omega \mapsto S_d f)$ (i.e., an operator that maps into the constant functions of $\omega$), then $S_d$ is still linear and continuous with operator norm
[TABLE]
and the usual randomized error can be written as
[TABLE]
The worst case error unfortunately does not allow for a representation as operator norm similar to (14).
Note that the above identification turns a randomized approximation problem
[TABLE]
into a deterministic -approximation problem
[TABLE]
4 Error Analysis in Terms of the Level
We now perform the error analysis of the approximation of by the Smolyak method in terms of the level which may be done under the rather general assumptions of Sections 2 and 3.
Theorem 3.
For let be a randomized Smolyak method as described in Section 3. Let . Assume (8), (9) and, depending on the setting, for additionally assume (5), (10) and for additionally assume (6), (7) and (11). Then we have
[TABLE]
where
Proof.
The second inequality in (15) follows easily by using and estimating so all there remains to be done is proving the first inequality.
Firstly we shall focus on the worst case error bound. Note that for a fixed
[TABLE]
Now we may proceed similarly as in the proof of Lemma 2 from [64], by induction on for and . For and any we have and so the statement is just the condition (9). Suppose we have already proved the claim for and want to prove it for Using
[TABLE]
and Minkowski’s inequality we get
[TABLE]
We use Minkowski’s inequality, properties of tensor product operator norms, the fact that component algorithms are randomized independently for different , (9) and (11) to obtain
[TABLE]
Furthermore, using (8),
[TABLE]
Therefore we have
[TABLE]
and using the induction hypothesis finishes the proof for the worst case error.
Now consider the randomized error. By similar calculations as in the first part of the proof one could show that the claim holds true for the randomized error for elementary tensors. Then, however, one encounters problems trying to lift it to the whole Hilbert space. The difficulty lies in the fact that the randomized error is not an operator norm of some tensor product operator, which would have enabled us to write it as a product of norms of the corresponding univariate operators and which has proved to be useful in bounding the worst case error. To get around this we need a different approach. The idea is to interpret a randomized problem as a deterministic approximation problem. As already explained in Remark 2 we may identify with an operator again denoted by Then however and we may proceed exactly as in Lemma 2 from [64], which finishes the proof. ∎
We may generalize the result of Theorem 3 by allowing for more flexibility in convergence rates in (9), (10) and (11). It can be used to capture additional logarithmic factors in the error bounds for the building block algorithms. This turns out to be particularly useful when investigating the error bounds for Smolyak methods whose building blocks are, e.g., multivariate quadratures or approximation algorithms, as is the case in [10]. Namely, suppose there exist a constant and non-decreasing sequences of positive numbers such that for every
[TABLE]
Moreover, in case of the randomized error
[TABLE]
and in case of the worst case error
[TABLE]
It is now easy to prove Corollary 4 along the lines of the proof of Theorem 3.
Corollary 4.
For let be a randomized Smolyak method as described in Section 3. Let . Assume (8), (16) and, depending on the setting, for assume (5), (17) and for assume (6), (7) and (18). Then we have
[TABLE]
where
Remark 5.
Note that applying Corollary 4 to the uni- or multivariate building block algorithms error bounds from [10] we may reproduce the error bounds obtained in this paper for the final (higher dimensional) Smolyak method.
5 Error Analysis in Terms of Information
5.1 Algorithms
Consider a linear approximation problem given by The aim of this section is to specify those linear operators that we want to call algorithms and to explain the typical information-based complexity framework for investigating the error of an algorithm in terms of the cardinality of information, for further reference see, e.g., [61]. To this end we shall specify a class of linear bounded functionals on called admissible information functionals and denoted by , which will become one more parameter of the approximation problem. Given a constant and, if , a collection of the information operator applied to is determined via
[TABLE]
Note that we are considering only non-adaptive information, meaning that the information functionals used do not depend on
A deterministic linear operator is called a deterministic linear algorithm if it admits a representation
[TABLE]
where is an information operator and is an arbitrary mapping. We denote the number of information functionals used by the deterministic algorithm for any input by i.e.,
[TABLE]
We denote the class of deterministic linear algorithms with admissible information functionals by
Let be an arbitrary sequence of algorithms and let be the information functionals used by We say that the sequence *uses nested information* if for every
[TABLE]
A randomized linear algorithm is a mapping
[TABLE]
such that
We denote the class of randomized linear algorithms with admissible information functionals by
For a randomized linear algorithm we may finally define
[TABLE]
We say that the information used by a sequence of randomized linear algorithms is nested if it is nested for each Note that the information used by is nested.
Now we would like to make some reasonable assumptions on the cost of building blocks of the Smolyak method. Consider a randomized Smolyak method as described in Section 3 with building blocks being randomized algorithms. Let
[TABLE]
Notice that For put
[TABLE]
Let us assume that for every the sequence is non-decreasing and that there exist constants such that for every it holds
[TABLE]
Note that this implies
[TABLE]
Example 6.
Consider the integration problem from Example 1. Let For let be Lebesgue measure on and be some reproducing kernel Hilbert space of functions defined on (e.g., a Sobolev space with sufficiently high smoothness parameter).
Choose a prime number and for and let be a scrambled net in base as introduced in [48]. Now
[TABLE]
is a randomized algorithm. Moreover, if we randomize in such a way that the families are independent then we may use them as building blocks of the Smolyak method and all the results of this paper apply, cf. also [10].
5.2 Upper Error Bounds
Throughout the whole section we require that the assumptions of Theorem 3 hold. Let us define where is as in (20) and is as in (9). We define the polynomial convergence rate of the algorithms by
[TABLE]
where It is straightforward to verify that for every Indeed, we have
[TABLE]
because of
[TABLE]
Hence for each the quantity is a lower bound on the polynomial order of convergence of the algorithms and can be chosen arbitrarily close to if the constants and in (9) are chosen appropriately.
The aim of this section is to develop upper bounds on the error of the $d$-variate Smolyak method in terms of and More concretely we prove the following theorem.
Theorem 7.
Let Let be as above, and let the assumptions of Theorem 3 hold. Then there exist constants such that for all and all it holds
[TABLE]
and
[TABLE]
where is the cardinality of information used by the algorithm
To prove Theorem 7 we need a lemma bounding in terms of and
Lemma 8.
Let be as above. Put
[TABLE]
[TABLE]
[TABLE]
For every and it holds
[TABLE]
Moreover, if the building blocks of the Smolyak method use nested information then
[TABLE]
Proof.
We have
[TABLE]
Now following the steps of [64, Lemma 7] we obtain
[TABLE]
Now we provide a lower bound on Note that given the cardinality of information used by the building blocks, the cardinality of information used by the Smolyak method is minimal when the information used by the building blocks is nested for every coordinate. In this case the information used by the Smolyak method is exactly the information used by Let us fix The expected value of the cardinality of information used by and at the same time not used by any other with is
[TABLE]
We obtain
[TABLE]
The upper bound in the case when the building blocks use nested information follows in exactly the same manner on noting that
[TABLE]
∎
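The role of nestedness in the cardinality count can be checked on a small example. The sketch below is illustrative only and rests on assumptions not made in the text: the univariate building block of level l is taken to use the dyadic grid with 2^l points, which is nested across levels, and the function names are hypothetical. It counts the multi-indices with positive integer entries and sum at most L, and verifies that with nested information the union of all tensor grids coincides with the union over the top layer of multi-indices, those whose entries sum exactly to L.

```python
from fractions import Fraction
from itertools import product
from math import comb

def multi_indices(d, L):
    """All multi-indices in N^d (entries >= 1) with entry sum at most L."""
    return [l for l in product(range(1, L - d + 2), repeat=d) if sum(l) <= L]

def dyadic_grid(level):
    """Nested univariate node set {k / 2**level : k = 0, ..., 2**level - 1}."""
    return [Fraction(k, 2 ** level) for k in range(2 ** level)]

def smolyak_nodes(d, L, only_top=False):
    """Distinct nodes used by tensor grids over the index set.

    With only_top=True only multi-indices with entry sum exactly L are used.
    """
    nodes = set()
    for l in multi_indices(d, L):
        if only_top and sum(l) != L:
            continue
        nodes.update(product(*(dyadic_grid(li) for li in l)))
    return nodes

def naive_count(d, L):
    """Node count if no node were shared between different tensor grids."""
    return sum(2 ** sum(l) for l in multi_indices(d, L))

d, L = 2, 4
n_indices = len(multi_indices(d, L))          # equals comb(L, d)
n_nested = len(smolyak_nodes(d, L))           # distinct nodes actually used
n_top = len(smolyak_nodes(d, L, only_top=True))
n_naive = naive_count(d, L)
```

For d = 2 and L = 4 the index set has six elements, and the nested count is well below the naive count, which is the effect exploited by the nested-information bound of Lemma 8.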
Proof.
(Theorem 7)
Note that so we have already shown the statement for in (24). It remains to consider the case Consider the function
[TABLE]
We will show that there exist constants such that for from Lemma 8 it holds
[TABLE]
and
[TABLE]
Now unimodality of combined with the fact that the extremum is a maximum yields finishing the proof.
First we prove (29). Calling upon Theorem 3 and using we get
[TABLE]
with constants depending neither on nor on By Stirling’s formula we conclude
[TABLE]
Now we prove (30). To this end it suffices to prove that there exist constants independent of and such that
[TABLE]
i.e.,
[TABLE]
Note that
[TABLE]
so, putting we have
[TABLE]
Since obviously this shows (31) and finishes the proof of the theorem. ∎
5.3 Lower Error Bounds
In this subsection we make the following additional assumptions.
The first assumption states that there exists a sequence of instances of the problem that is genuinely univariate, i.e., there exists a sequence such that for which
[TABLE]
and the are exact on for .
Secondly, we assume that there exist constants such that for every
[TABLE]
Let us put
[TABLE]
with as in (33) and as in (20). Using (33) and (21) one easily sees that
[TABLE]
meaning that we have where is as in (22). Moreover, by choosing appropriately, can be made arbitrarily close to
Example 9.
The assumptions made in this subsection are quite naturally met for many important problems. Consider for instance an integration problem as described in Example 1, where may be any spaces containing constant functions. Then, for an appropriate (chosen so that the integration error does not converge too fast to $0$) we have that
[TABLE]
satisfies our assumptions for any randomized quadrature with weights adding up to
Lemma 10.
Let and let (32) and (33) hold. Then there exists a constant such that
[TABLE]
for all
If additionally (32) and (33) are satisfied for all with the same constants and and is bounded, then we may choose the constants in such a way that they are all equal.
Proof.
Choosing satisfying (32), due to exactness assumption we obtain
[TABLE]
Let us put Due to (34) we have
[TABLE]
∎
Remark 11.
In particular constants may be chosen all equal e.g. when (32) and (33) are satisfied for all with the same constants and
Lemma 12.
Let there exist constants such that for all and (20) is satisfied. Then there exists a constant such that
[TABLE]
for all Moreover, if then there exists a constant such that
[TABLE]
for all
Proof.
First we prove the upper bound.
On the one hand, due to (21) it holds
[TABLE]
where we used that the function is increasing.
On the other hand, according to Lemma 8
[TABLE]
It follows that the constant
[TABLE]
does the job.
Now we prove the lower bound.
On the one hand due to (21) we have
[TABLE]
Noticing that is increasing we obtain
[TABLE]
On the other hand we have due to Lemma 8
[TABLE]
It follows that the constant
[TABLE]
satisfies
[TABLE]
∎
Remark 13.
Note that the constants and from Lemma 12 fall superexponentially fast in
Corollary 14.
Let Let (32) and (33) hold. Furthermore, let there exist constants such that for all and (20) is satisfied. Moreover, assume that . Then there exists a constant such that given there exists such that for every
[TABLE]
with and
Proof.
Let be such that for every
[TABLE]
The existence of is guaranteed by Lemma 12. We put
We would like to express the bound from Lemma 10 in terms of the cardinality . To this end we want to find a function of the form such that for large
[TABLE]
implying
[TABLE]
We rewrite (35) as
[TABLE]
Hence (35) holds if
[TABLE]
and the expression on the right hand side converges from below to as goes to To obtain
[TABLE]
it is sufficient to check that $g$ is increasing on the interval Simple calculations reveal that is increasing on The final step is to notice that
[TABLE]
Putting finishes the proof. ∎
6 Application to Infinite-Dimensional Integration
In Theorem 18 we provide a sharp result on randomized infinite-dimensional integration on weighted reproducing kernel Hilbert spaces that parallels the sharp result on deterministic infinite-dimensional integration stated in [24, Theorem 5.1]. Results from [23] and from [52] in combination with Theorem 7 rigorously establish the sharp randomized result in the special case where the weighted reproducing kernel Hilbert space is based on an anchored univariate kernel. With the help of the embedding tools provided in [24] this result will be extended to general weighted reproducing kernel Hilbert spaces. Before we can state and prove Theorem 18 we first have to introduce the setting, cf. [24].
For basic results about reproducing kernels and the corresponding Hilbert spaces we refer to [1]. We denote the norm on by and the space of constant functions (on a given domain) by ; here denotes the constant kernel that only takes the function value one.
6.1 Assumptions
Henceforth we assume that
- (A1) is a vector space of real-valued functions on a domain with
and
- (A2) and are seminorms on , induced by symmetric bilinear forms and , such that and .
Let
[TABLE]
Furthermore, we assume that
- (A3) is a norm on that turns this space into a reproducing kernel Hilbert space, and there exists a constant such that
[TABLE]
Condition (37) is equivalent to the fact that and are equivalent norms on .
Let us restate Lemma 2.1 from [24]:
Lemma 15.
For each there exists a uniquely determined reproducing kernel on such that as vector spaces and
[TABLE]
Moreover, the norms and are equivalent and .
Note that for the special value we have .
The next example illustrates the assumptions and the statement of Lemma 15; for more information and a slight generalization see [24, Example 2.3].
Example 16.
Let and . The periodic Sobolev space (also known as Korobov space) is the Hilbert space of all with finite norm
[TABLE]
where is the -th Fourier coefficient of . The functions in are continuous and periodic. It is easily checked that the reproducing kernel of is given by
[TABLE]
Consider the pair of seminorms on given by
[TABLE]
The assumptions (A1), (A2), and (A3) are easily verified. For we have .
Further examples of spaces that satisfy the assumptions (A1), (A2), and (A3) are, for instance, the (non-periodic) Sobolev spaces of smoothness endowed with either the standard norm, the anchored norm or the ANOVA norm, see [24, Example 2.1].
We now want to study weighted tensor product Hilbert spaces of multivariate functions, which implies that we have to consider product weights as introduced in [55]. More precisely, we consider a sequence of positive weights that satisfies
[TABLE]
The decay of the weights is quantified by
[TABLE]
due to (39) we have . For each weight let be the kernel from Lemma 15. With the help of the weights we can define spaces of functions of finitely many variables. For we define the reproducing kernel on by
[TABLE]
The reproducing kernel Hilbert space is the (Hilbert space) tensor product of the spaces .
Now we want to define a space of functions of infinitely many variables. The natural domain for the counterpart of (40) for infinitely many variables is given by
[TABLE]
Let be arbitrary. Due to [24, Lemma 2.2] we have , and in particular . We define the reproducing kernel on by
[TABLE]
For a function we define by
[TABLE]
Due to [24, Lemma 2.3] is a linear isometry from into , and
[TABLE]
6.2 The Integration Problem
To obtain a well-defined integration problem we assume that is a probability measure on implying
[TABLE]
Let and denote the corresponding product measures on and , respectively.
Due to [24, Lemma 3.1] we have for all that
[TABLE]
and the respective embeddings from into are continuous with
[TABLE]
Define the linear functional by
[TABLE]
Note that , since and . Furthermore, , and therefore (45) implies
[TABLE]
This yields the existence of a uniquely determined bounded linear functional
[TABLE]
cf. [24, Lemma 3.2].
Note that every is measurable with respect to the trace of the product σ-algebra on . (This follows from (44), (45), and the fact that the pointwise limit of measurable functions is again measurable.)
If is measurable, , and , then the bounded linear functional (47) is given by
[TABLE]
For sufficient conditions under which these assumptions are fulfilled we refer to [27].
We consider the integration problem on consisting in the approximation of the functional by randomized algorithms that use function evaluations (i.e., standard information) as admissible information.
6.3 The Unrestricted Subspace Sampling Model
We use the cost model introduced in [40], which we refer to as unrestricted subspace sampling model. It only accounts for the cost of function evaluations. To define the cost of a function evaluation, we fix an anchor and a non-decreasing function
[TABLE]
Put
[TABLE]
For each put
[TABLE]
To simplify the representation, we confine ourselves to non-adaptive randomized linear algorithms of the form
[TABLE]
where the number of knots is fixed and the knots as well as the coefficients are random variables with values in some , , and in , respectively. (We discuss a larger class of algorithms in Remark 21.) The cost of is given by
[TABLE]
In the definition of the cost function an inclusion property has to hold for all . Often this worst-case point of view is replaced by an average-case one (cf., e.g., [42] or [9, 23, 52]). We stress that such a replacement would not affect the cost of the algorithms that we employ to establish our upper bounds for the N-th minimal errors; for lower bounds cf. Remark 21(iii).
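The cost model can be illustrated by a small Python sketch. It assumes, in line with the usual formulation of the unrestricted subspace sampling model, that evaluating a function at a point whose coordinates differ from the anchor in exactly ν positions costs $(ν) (the cheapest admissible choice, since $ is non-decreasing), and that the cost of an algorithm sums, over its knots, the worst case over all realizations of each knot. The dictionary representation of points and all function names are hypothetical.

```python
def active_count(point, anchor):
    """Number of coordinates of `point` that differ from the anchor.

    `point` maps coordinate indices to values; every coordinate that is
    not listed is understood to equal the anchor.
    """
    return sum(1 for value in point.values() if value != anchor)


def evaluation_cost(point, anchor, dollar):
    """Cost of one function evaluation: dollar(number of active coordinates)."""
    return dollar(active_count(point, anchor))


def algorithm_cost(knot_realizations, anchor, dollar):
    """Sum over the knots of the worst-case evaluation cost.

    `knot_realizations[i]` lists the possible realizations of the i-th
    random knot (a finite stand-in for "for all omega").
    """
    return sum(
        max(evaluation_cost(p, anchor, dollar) for p in realizations)
        for realizations in knot_realizations
    )


def linear_dollar(nu):
    """Example cost function with linear growth (an assumption for illustration)."""
    return 1 + nu


knots = [
    [{1: 0.25}, {1: 0.75, 2: 0.5}],  # first knot: at most two active coordinates
    [{}],                            # second knot: the anchor point itself
]
total = algorithm_cost(knots, anchor=0.0, dollar=linear_dollar)
```

Here the first knot contributes the cost of its most expensive realization, which has two active coordinates, and the second knot contributes the cost of evaluating at the anchor.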
Let . For let us define the N-th minimal error on by
[TABLE]
where in the case the algorithms have to be deterministic, while in the case they are allowed to be randomized. The (polynomial) convergence order of the N-th minimal errors of infinite-dimensional integration is given by
[TABLE]
In analogy to our definitions for infinite-dimensional integration, we also consider for univariate integration on linear randomized algorithms of the form (48), except that this time the knots are, of course, random variables with values in . The cost of such an algorithm is simply the number of function evaluations, and the N-th minimal errors on are given by
[TABLE]
The (polynomial) convergence order of the N-th minimal errors of univariate integration is given by
[TABLE]
Remark 17.
Let and, accordingly, . Obviously,
[TABLE]
Furthermore, it is easy to see that holds: If is an arbitrary randomized algorithm of the form (48) with , then for every the cost of the deterministic algorithm is at most , implying
[TABLE]
which in turn leads to . Hence we obtain
[TABLE]
6.4 A Sharp Result on Infinite-Dimensional Integration
The next theorem determines the exact polynomial convergence rate of the N-th minimal errors of infinite-dimensional integration on weighted reproducing kernel Hilbert spaces.
Theorem 18.
Let . If the cost function $\$$ satisfies $\$(\nu)=\Omega(\nu)$ and $\$(\nu)=O(e^{\sigma\nu})$ for some $\sigma\in(0,\infty)$, then we have
[TABLE]
Notice that the theorem implies that in the randomized setting infinite-dimensional integration on weighted reproducing kernel Hilbert spaces is (essentially) not harder than the corresponding univariate integration problem (as far as the polynomial convergence rate is concerned) as long as the weights decay sufficiently fast, i.e., as long as
[TABLE]
Proof.
Let us first consider the case . In the special case where the reproducing kernel is anchored in (i.e., ) and satisfies for all (cf. Lemma 15), the statement of the Theorem follows from [23] and from [52] in combination with Theorem 7, as we will explain below in detail.
For a general reproducing kernel we need to find a suitably associated reproducing kernel anchored in and satisfying for all to employ the embedding machinery from [24] to obtain the desired result (52). To this purpose we consider the bounded linear functional , , where is our fixed anchor. We define a new pair of seminorms on by
[TABLE]
Notice that is induced by the symmetric bilinear form . This new pair of seminorms obviously satisfies assumption (A2), and the norms and are equivalent norms on . Hence turns into a reproducing kernel Hilbert space, and satisfies (37) with since
[TABLE]
Thus the new pair of seminorms also satisfies (A3). Furthermore, if is the reproducing kernel on such that
[TABLE]
and
[TABLE]
then is anchored in and moreover we have as vector spaces, , and
[TABLE]
for all , , implying , see [24, Rem. 2.2]. Since and are equivalent norms on , we obtain . Due to [24, Thm. 2.3] we have
[TABLE]
According to (42) we define by
[TABLE]
Now we consider the integration problem in and may use [23, Subsect. 3.2.1] and [52, Cor. 1] in combination with Theorem 7. Indeed, due to Theorem 7 we may choose linear randomized algorithms with convergence rates arbitrarily close to to obtain via the randomized Smolyak method algorithms that satisfy (26) for (and consequently also [52, Eqn. (10)]). Now [52, Cor. 1] ensures that
[TABLE]
Furthermore, we have due to [23, Eqn. (21)]
[TABLE]
Due to [24, Cor. 5.1] these estimates also hold for .
Let us now consider the case . Due to (51), identity (52) follows directly from the deterministic result [24, Theorem 5.1]. ∎
We now provide two corollaries and add some remarks.
Theorem 18, which deals with randomized algorithms, and the corresponding deterministic theorem [24, Theorem 5.1] immediately allow us to compare the power of deterministic and randomized algorithms.
Corollary 19.
Let the assumptions of Theorem 18 hold. For infinite-dimensional integration on randomized algorithms are superior to deterministic algorithms, i.e., , if and only if
[TABLE]
are satisfied.
The next corollary on infinite-dimensional integration on weighted Korobov spaces in the randomized setting parallels [24, Theorem 5.5], which discusses the deterministic setting.
Corollary 20.
Let , and let the univariate reproducing kernel be as in (38). Then the weighted Korobov space is an infinite tensor product of the periodic Korobov space of smoothness , see Example 16. If the cost function $\$$ satisfies $\$(\nu)=\Omega(\nu)$ and $\$(\nu)=O(e^{\sigma\nu})$ for some $\sigma\in(0,\infty)$, then we have
[TABLE]
Proof.
Since and (see Appendix and Remark 17), Theorem 18 immediately yields the result for and .
Notice that the result for can also be derived from Remark 17 and [24, Theorem 5.5]. ∎
Remark 21.
Let us come back to Theorem 18.
- (i) Algorithms that achieve convergence rates arbitrarily close to are, e.g., multivariate decomposition methods (MDMs) that were introduced in [40] (in the deterministic setting) and developed further in [52] (in the deterministic and in the randomized setting); originally, these algorithms were called changing dimension algorithms, cf., e.g., [8, 9, 23, 40, 52]. MDMs exploit that the anchored function decomposition of an integrand can be efficiently computed; a method for multivariate integration based on the same idea is the dimension-wise integration method proposed in [30]. To achieve (nearly) optimal convergence rates, the MDMs may employ as building blocks Smolyak algorithms for multivariate integration that rely on (nearly) optimal algorithms for univariate integration on , cf. [52, Section 3.3] and the proof of Theorem 18.
- (ii) In the special case where and where is an ANOVA-kernel (i.e., satisfies for every ) a version of Theorem 18 was already proved in [9, Theorem 4.3]. It was the first result that rigorously showed that MDMs can achieve the optimal order of convergence also on spaces with norms that are not induced by an underlying anchored function space decomposition. It was not derived with the help of function space embeddings, but by an elaborate direct analysis. Apart from addressing only the ANOVA setting, a further drawback of [9, Theorem 4.3] is that its assumptions are slightly stronger than the ones made in Theorem 18: It is not sufficient to know the convergence rate of the N-th minimal errors of the univariate integration problem, but additionally one has to verify the existence of unbiased randomized algorithms for multivariate integration that satisfy certain variance bounds, see [9, Assumption 4.1]. Nevertheless, in many important cases it is well known that such variance bounds hold. Furthermore, one should mention that the analysis in [9] is not restricted to product weights as in this section, but is done for general weights.
Note that the kernel of the Korobov space from Example 16 and Corollary 20 is actually an ANOVA kernel. Hence the identity for in Corollary 20 may also be derived by employing [9, Theorem 4.3] after verifying the existence of unbiased algorithms for multivariate integration that satisfy [9, Assumption 4.1].
- (iii) The upper bound for in (52) relies on the corresponding bound [23, Eqn. (21)] for the case where the univariate reproducing kernel is anchored in . Although the definition of the cost function in [23] takes the average case and not the worst case point of view and differs therefore from (49), both definitions lead to the same cost for the admissible class of algorithms considered in the unrestricted subspace sampling model in [23]. The class contains not only algorithms of the form (48), but also adaptive and non-linear algorithms. In the proof of Theorem 18 we employ the function space embeddings from [24], which allows us to transfer results for linear algorithms from the case of anchored kernels to the general case. Hence we can conclude that the upper bound for in (52) holds also if we admit adaptive linear algorithms of the form (48) for infinite-dimensional as well as for univariate integration, but we do not know whether this is still the case if we admit non-linear algorithms.
We finish this section with some remarks on extensions of our results on infinite-dimensional integration to other settings.
Remark 22.
To obtain computational tractability of problems depending on a high or infinite number of variables, it is usually essential to be able to arrange the variables in such a way that their impact decays sufficiently fast. One approach to model the decreasing impact of successive variables is to use weighted function spaces, like the ones we defined and studied in this section, to moderate the influence of groups of variables. This approach goes back to the seminal paper [55]. Another approach is the concept of increasing smoothness with respect to properly ordered variables, see, e.g., [11, 25, 31, 37, 38, 49, 54]. The precise definition of Hilbert spaces of functions depending on infinitely many variables of increasing smoothness can be found in [25, Section 3]. Now [25, Theorem 3.19] shows how to relate these spaces to suitable weighted Hilbert spaces via mutual embeddings, making it therefore easy to transfer our results in the randomized setting, Theorem 18 and Corollary 20, from weighted spaces to spaces with increasing smoothness, cf. [25, Theorem 4.5 and Corollary 4.7] for the corresponding transference results in the deterministic setting.
Instead of applying our result Theorem 7 to the infinite-dimensional integration problem, we may also use it to tackle the infinite-dimensional -approximation problem. Indeed, a sharp result for the latter problem was obtained in [63, Corollary 9] in the deterministic setting for weighted anchored reproducing kernel Hilbert spaces with the help of multivariate decomposition methods based on Smolyak algorithms (cf. [66, Theorem 7]). The analysis relies on explicit cost bounds for deterministic Smolyak algorithms from [64]. In [25, Theorem 4.5] the result is extended to weighted (not necessarily anchored) reproducing kernel Hilbert spaces (relying on the embedding tools from [24]) and to spaces of increasing smoothness.
Now one may use Theorem 7 to establish a corresponding result to [63, Corollary 9] for weighted anchored spaces in the randomized setting and may generalize it to non-anchored weighted spaces and to spaces of increasing smoothness via the embedding results established in [24, 25].
To work out all the details of these generalizations is beyond the scope of the present paper.
7 Appendix
7.1 Randomized Integration Error in Korobov Spaces
For we denote by the space of Korobov functions on the one-dimensional torus with smoothness parameter . The space is equipped with the norm
[TABLE]
see Example 16. It is a folklore result that the polynomial convergence order of randomized quadratures on is equal to . Since we have not found in the literature a complete proof handling all cases , we decided to provide a proof sketch in this appendix. Similar reasoning for (non-periodic) Sobolev spaces with integer parameter may be found, e.g., in [44, Chapter 2.2]. For the lower error bound we need the following lemma.
Lemma 23.
Let . There exists a sequence of functions and a constant such that for every
- (1)
- (2)
- (3)
Proof.
Let be a positive, infinitely differentiable function with and integral equal to . Define
[TABLE]
The sequence is the sequence we are looking for. Since all other properties are obvious, it is enough to check that for condition (3) holds. Fix . Since (as a bump function) is in the Schwartz space, the same holds for its Fourier transform, and as a result there exists such that for every
[TABLE]
Since we may neglect it. Simple calculations reveal
[TABLE]
Due to monotonicity of we obtain
[TABLE]
which finishes the proof. ∎
For the upper bound we need a lemma on the approximation of functions from Korobov spaces by trigonometric polynomials.
Lemma 24.
Let , and let be the trigonometric polynomial defined by
[TABLE]
where
[TABLE]
are the discrete Fourier coefficients of . Then we have
[TABLE]
and
[TABLE]
The discrete Fourier coefficients , , can be computed via the fast Fourier transform at cost .
Proof.
Proofs of the statements of the lemma can be found in many standard texts on numerical analysis. We follow the course of [32, Sections 52 and 53]. Since we may write as a uniformly convergent Fourier series
[TABLE]
It is well-known (and not difficult to calculate) that if is a trigonometric polynomial of degree interpolating in the nodes then it is given by
[TABLE]
see for example Lemma in [32]. It holds
[TABLE]
The last sum may be bounded in an obvious way by , and for the double sum we may use the Cauchy-Schwarz inequality to obtain
[TABLE]
The cost analysis of the fast Fourier transform is well known and can, e.g., be found in [32, Section 53]. ∎
Theorem 25.
Let . Then we have
[TABLE]
Proof.
The upper bound on is settled immediately by Lemma 23 in conjunction with Corollary from [41]. Indeed, for choose and
[TABLE]
Then the assumptions of Corollary from [41] are satisfied for implying , and thus establishing the upper bound.
To get the lower bound on consider the following algorithm. Let . We first interpolate with the trigonometric polynomial from Lemma 24. Now the integral over is simply the discrete Fourier coefficient . To approximate the integral of we apply a simple Monte Carlo quadrature. To this end let be independent random variables such that is distributed uniformly on . We put
[TABLE]
Now, since is unbiased and are independent, applying Lemma 24 we get
[TABLE]
Since the cost of the algorithm is of order , the claim follows. ∎
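The algorithm used in the second part of the proof (trigonometric interpolation as a control variate, followed by Monte Carlo on the residual) can be sketched in Python as follows. This is only an illustration under stated assumptions: it uses the n = 2m + 1 equidistant nodes j/n, frequencies |h| ≤ m, and n independent uniformly distributed sample points on [0, 1), all of which may differ from the exact choices in Lemma 24 and in the proof. The function names are hypothetical, and the discrete Fourier coefficients are computed by a direct sum rather than a fast Fourier transform to keep the sketch dependency-free.

```python
import cmath
import math
import random


def discrete_fourier_coefficients(f, m):
    """Coefficients c_h, |h| <= m, from samples of f at the n = 2m+1 nodes j/n.

    Direct O(n^2) summation for clarity; a fast Fourier transform would
    produce the same numbers at lower cost.
    """
    n = 2 * m + 1
    samples = [f(j / n) for j in range(n)]
    return {
        h: sum(s * cmath.exp(-2j * math.pi * h * j / n)
               for j, s in enumerate(samples)) / n
        for h in range(-m, m + 1)
    }


def trig_interpolant(coeffs):
    """Trigonometric polynomial with the given coefficients (real part only)."""
    def p(x):
        return sum(c * cmath.exp(2j * math.pi * h * x)
                   for h, c in coeffs.items()).real
    return p


def control_variate_quadrature(f, m, rng):
    """Integral of the interpolant plus a Monte Carlo estimate of the residual."""
    coeffs = discrete_fourier_coefficients(f, m)
    p = trig_interpolant(coeffs)
    n = 2 * m + 1
    exact_part = coeffs[0].real  # integral of the interpolant over [0, 1)
    points = [rng.random() for _ in range(n)]
    mc_part = sum(f(x) - p(x) for x in points) / n
    return exact_part + mc_part


def smooth_periodic(x):
    return math.exp(math.sin(2.0 * math.pi * x))


estimate = control_variate_quadrature(smooth_periodic, m=8, rng=random.Random(0))
```

The estimator is unbiased because the interpolant is integrated exactly, and its variance is governed by the interpolation error, which is the mechanism the proof exploits. For a trigonometric polynomial of degree at most m the residual vanishes up to rounding, so the result is exact regardless of the random points.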
Acknowledgment
The authors thank Stefan Heinrich for pointing out the reference [34]. Part of the work was done while the authors were visiting the Mathematical Research and Conference Center Bedlewo in Autumn 2016 and the Erwin Schrödinger Institute for Mathematics and Physics (ESI) in Vienna in Autumn 2017. Both authors acknowledge support by the Polish Academy of Sciences; Marcin Wnuk additionally acknowledges support by the German Academic Exchange Service (DAAD).
- [1] N. Aronszajn, Theory of reproducing kernels, Trans. Amer. Math. Soc., 68 (1950), pp. 337–404.
- [2] J. Baldeaux and M. Gnewuch, Optimal randomized multilevel algorithms for infinite-dimensional integration on function spaces with ANOVA-type decomposition, SIAM J. Numer. Anal., 52 (2014), pp. 1128–1155.
- [3] G. Baszenski and F. J. Delvos, Multivariate Boolean midpoint rules, in Numerical Integration IV, H. Brass and H. Hämmerlin, eds., Basel, 1993, Birkhäuser, pp. 1–11.
- [4] H. J. Bungartz and M. Griebel, Sparse grids, Acta Numerica, 13 (2004), pp. 147–269.
- [5] F.-J. Delvos, d-variate Boolean interpolation, J. Approx. Theory, 34 (1982), pp. 99–114.
- [6] F.-J. Delvos, Boolean methods for double integration, Math. Comp., 55 (1990), pp. 683–692.
- [7] F.-J. Delvos and W. Schempp, Boolean Methods in Interpolation and Approximation, vol. 230 of Pitman Research Notes in Mathematics, Longman, Essex, 1989.
- [8] J. Dick and M. Gnewuch, Infinite-dimensional integration in weighted Hilbert spaces: anchored decompositions, optimal deterministic algorithms, and higher order convergence, Found. Comput. Math., 14 (2014), pp. 1027–1077.
