Double-loop randomized quasi-Monte Carlo estimator for nested integration
Arved Bartuska, Andr\'e Gustavo Carlon, Luis Espath, Sebastian Krumscheid, Ra\'ul Tempone

TL;DR
This paper introduces a novel nested randomized quasi-Monte Carlo (rQMC) estimator for nested integrals, providing theoretical error bounds and demonstrating improved efficiency over traditional methods in complex applications.
Contribution
The work develops a new nested rQMC method that simultaneously approximates inner and outer integrals, with rigorous error analysis and practical truncation schemes.
Findings
Derived asymptotic error bounds for bias and variance.
Addressed integrands with infinite variation using Owen's scrambling.
Numerical experiments show improved efficiency over standard nested MC.
Abstract
Nested integration of the form , characterized by an outer integral connected to an inner integral through a nonlinear function , is a challenging problem in various fields, such as engineering and mathematical finance. The available numerical methods for nested integration based on Monte Carlo (MC) methods can be prohibitively expensive owing to the error propagating from the inner to the outer integral. Attempts to enhance the efficiency of these approximations using the quasi-MC (QMC) or randomized QMC (rQMC) method have focused on either the inner or outer integral approximation. This work introduces a novel nested rQMC method that simultaneously addresses the approximation of the inner and outer integrals. The method leverages the unique nested integral structure to offer a more efficient approximation mechanism. As…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsMathematical Approximation and Integration · Statistical Methods and Inference · Stochastic processes and financial applications
Double-loop quasi-Monte Carlo estimator for nested integration
Arved Bartuska1, André Gustavo Carlon2, Luis Espath3, Sebastian Krumscheid5, & Raúl Tempone1,2,4
1Department of Mathematics, RWTH Aachen University, Gebäude-1953 1.OG, Pontdriesch 14-16, 161, 52062 Aachen, Germany
2King Abdullah University of Science & Technology (KAUST), Computer, Electrical and Mathematical Sciences & Engineering Division (CEMSE), Thuwal 23955-6900, Saudi Arabia
3School of Mathematical Sciences, University of Nottingham, Nottingham, NG7 2RD, United Kingdom
4Alexander von Humboldt Professor in Mathematics for Uncertainty Quantification, RWTH Aachen University, Germany
5Steinbuch Center for Computing, and Institute for Applied and Numerical Mathematics, Karlsruhe Institute of Technology, Germany
Abstract.
Nested integration arises when a nonlinear function is applied to an integrand, and the result is integrated again, which is common in engineering problems, such as optimal experimental design, where typically neither integral has a closed-form expression. Using the Monte Carlo method to approximate both integrals leads to a double-loop Monte Carlo estimator, which is often prohibitively expensive, as the estimation of the outer integral has bias relative to the variance of the inner integrand. For the case where the inner integrand is only approximately given, additional bias is added to the estimation of the outer integral. Variance reduction methods, such as importance sampling, have been used successfully to make computations more affordable. Furthermore, random samples can be replaced with deterministic low-discrepancy sequences, leading to quasi-Monte Carlo techniques. Randomizing the low-discrepancy sequences simplifies the error analysis of the proposed double-loop quasi-Monte Carlo estimator. To our knowledge, no comprehensive error analysis exists yet for truly nested randomized quasi-Monte Carlo estimation (i.e., for estimators with low-discrepancy sequences for both estimations). We derive asymptotic error bounds and a method to obtain the optimal number of samples for both integral approximations. Then, we demonstrate the computational savings of this approach compared to standard nested (i.e., double-loop) Monte Carlo integration when estimating the expected information gain via two examples from Bayesian optimal experimental design, the latter of which involves an experiment from solid mechanics.
AMS subject classifications: 62F15 65C05 65D30 65D32
Contents
1. Introduction
A nested integral is an integral of a usually nonlinear function of another parametric integral. Integrals of this type appear in many fields (e.g., geology [God18], mathematical finance [Xu20], medical decision-making [Fan22], and optimal experimental design (OED) [Rya03]). The Monte Carlo (MC) method is one of the most popular approximation techniques for integrals, especially high-dimensional ones. For nested integrals, both can be approximated using the MC method, resulting in the double-loop MC (DLMC) estimator. Using MC to approximate a single integral to a specified error tolerance requires a sample size of , whereas using DLMC results in worse complexity, with an overall number of samples of [Rya03]. The two MC estimators in the DLMC estimator are connected through a nonlinear function; thus, the statistical error of the inner MC estimator causes bias in the DLMC estimator. The sample sizes of both MC estimators must be carefully controlled to guarantee that the error in the DLMC estimator is below with a certain confidence, controlling both the statistical error and bias. Improving the DLMC performance is the goal of intense research, with some approaches proposing the use of Laplace approximation [Lon13], importance sampling [Bec18], and multilevel MC (MLMC) [Fan22].
The randomized quasi-MC (RQMC) method [Caf98, Hic98, Nie92, Owe03, Dic10, Lec18] is a promising technique to improve the efficiency of the basic MC method (i.e., the required number of samples to meet a certain tolerance) while maintaining nonintrusive sampling. The RQMC estimator uses deterministic points from a low-discrepancy sequence and randomizes the entire sequence while maintaining a low-discrepancy structure. Randomization allows for the use of either the central limit theorem or Chebyshev inequality to estimate and subsequently bound the error asymptotically [Tuf04, Lec10, Lec18]. Given appropriate regularity assumptions on the integrand, the RQMC method can reduce the order of convergence of the approximation error without introducing additional bias to the MC estimator. Furthermore, if the integrand is sufficiently smooth, this approach can yield an asymptotic rate of 1 as the number of low-discrepancy points increases. The number of evaluations needed by the RQMC estimator to achieve a tolerance is .
Recently, researchers [Gob22] have investigated the optimal error tolerance that can be achieved by RQMC, given a fixed number of samples and a certain confidence level. They demonstrated that combining RQMC with robust estimation improves error tolerances. The RQMC concepts have been applied in the context of OED in [Dro18], but only to the outer integral of a nested integration problem. In [Dro18], the inner integral is approximated using the MC method. A reduced sample standard deviation was observed for several numerical experiments for this scheme compared to using the MC method for both integrals, which was demonstrated for a fixed number of outer and inner samples.
In recent work [Fan22], an RQMC method was used to approximate the outer integral of a nested estimator in medical decision-making. The authors estimated the variance error using the sample variance and bias error by successively doubling the number of inner samples. They compared this RQMC method with an MCLC approach and standard nested (i.e., double loop) MC by specifying a target mean squared error and observing the number of samples needed until this target is reached. Both MLMC and RQMC have similar performance results in practice, depending on the number of parameters and other measures of model complexity.
In [God18], nested integrals are approximated using MLMC techniques. The number of inner samples is increased for each level to reduce the bias induced on the outer approximation by the variance of the inner approximation. The RQMC estimator approximates the inner integral and reduces the variance, requiring a smaller sample size in the inner loop to achieve the error tolerance in the MLMC setting specified in [Gil08, Theorem 3.1]. This outcome is presented both theoretically and via examples. A similar approach is followed in [Xu20]. In this study, a discontinuous inner integrand is approximated using a sigmoid (i.e., smooth function), allowing the RQMC method to be applied.
To further reduce the number of samples required to estimate nested integrals up to a specified error tolerance , we use RQMC for both integrals to build a double-loop quasi-MC (DLQMC) estimator. Indeed, under suitable regularity conditions, the DLQMC method can significantly reduce the required number of overall samples to , compared to for the DLMC method. Moreover, we demonstrate that using RQMC for the outer integral has a greater effect on the overall number of samples than using it for the inner integral, but further savings can still be achieved by applying RQMC to both integrals. We also consider the case where the inner integrand is given only approximately in terms of a computational model, resulting in additional bias for the outer approximation. We provide approximate error bounds using suitable RQMC estimators and verify them on numerical examples.
This paper provides a quick overview of the MC and quasi-MC (QMC) methods, including bounds on the absolute error in Section 2. We introduce the proposed nested RQMC estimator in Section 3. As the main contributions of this work, we derive asymptotic error bounds on the number of inner and outer samples in Propositions 1 and 2, and the optimal setting for this estimator in Proposition 3. Finally, in Section LABEL:sec:numerical.results, we present two examples from Bayesian OED, where nested integrals frequently arise. The first example is an algebraic model introduced in [Hua13], which can be evaluated at a low cost and serves as a toy problem to highlight the effectiveness of the proposed method. The second example demonstrates an application from solid mechanics involving the solution to a partial differential equation (PDE) with favorable regularity properties, demonstrating the practical applicability of the DLQMC estimator.
Greek alphabet
confidence level
increase in convergence rate of QMC beyond MC for the outer integral
work rate for the finite element method
increase in convergence rate of QMC beyond MC for the inner integral
observation noise
central limit theorem error
strain tensor
convergence rate of the discretization error
parameter of interest
dummy variable for the parameter of interest
absolute temperature
splitting parameter between bias and variance error
first Lamé constant
second Lamé constant
design parameter
probability density function of the parameter of interest
randomization
material density
diagonal elements of the error covariance matrix
error covariance matrix and approximate negative inverse Hessian of the log-likelihood
cumulative distribution function of the standard normal distribution
random outcome in the finite element formulation
space of outcomes in the finite element formulation
2. Brief overview of Monte Carlo and quasi-Monte Carlo integration
Before we address the case of nested integration, which is the focus of this work, we first recall the basic concepts for approximating integrals using MC and RQMC for the reader’s convenience.
2.1. Monte Carlo method
We can approximate the integral
[TABLE]
where is square-integrable, and a positive integer, using the MC estimator:
[TABLE]
The MC method uses random points that are independent and identically distributed (iid) samples from the uniform distribution to approximate in (1). Using the central limit theorem (CLT) [Dur19] to analyze the error of the MC estimator, we find that
[TABLE]
where
[TABLE]
as , where , is the inverse cumulative distribution function (cdf) of the standard normal distribution, and is the variance of the integrand for . Alternatively, Chebyshev’s inequality could be used to obtain an error estimate similar to (3) and (4), although with a potentially larger constant .
2.2. Quasi-Monte Carlo method
The MC method always converges to the true value of the integral, even under weak assumptions, but the rate of can be improved for certain integrands. We can instead use the RQMC method, which achieves a better convergence rate by exploiting the regularity properties of the integrand. For a square-integrable function , the RQMC estimator to approximate the integral (1) is given by
[TABLE]
where are chosen from a sequence of points consisting of a deterministic component and a random component ,
[TABLE]
In particular, choosing from a low-discrepancy sequence [Nie92, Hic98] results in improved convergence for smooth integrands. One example that can achieve this is lattice rules [Hic98], and is a random shift. This example provides points of the shape
[TABLE]
where denotes the componentwise fractional part operator. Every one-dimensional projection of this point set is injective. Another common method for selecting a suitable low-discrepancy sequence is to choose the deterministic points from a digital sequence [Nie92, Owe03] and set to be random permutations of the digits of . By splitting into equally spaced subintervals in each dimension, each subinterval contains the same number of points. We use a digital sequence called Sobol sequence [Sob67] throughout this work, as it has performed best on numerical tests. A number of points , such that , must be used to achieve the best results for this sequence type. The difference between the estimators (2) and (5) lies in the points used to evaluate the function to be integrated.
Traditional error estimates for numerical integration based on deterministic low-discrepancy sequences (i.e., in (6)) use the Koksma–Hlawka inequality [Hla61, Nie92] to bound the error by a product of suitable measures for the low-discrepancy sequence and integrand, respectively. This approach can be problematic in practice because, in most instances, sharp estimates of these quantities are exceedingly difficult to obtain, and the resulting quadrature error bound is far from optimal [Lec18].
To obtain a probabilistic estimate using the CLT, we must use several iid randomizations , in (6). Then, we find that
[TABLE]
approximately holds for
[TABLE]
for , where the variance of the RQMC estimator can be approximated as follows:
[TABLE]
and
[TABLE]
For a fixed , the error (9) decreases at the rate for [Gob22, Loh03, Dic13, Lec18], where depends on the dimension and may depend on the regularity of the integrand . For , this provides the usual MC rate of 1/2. For certain functions with desirable properties such as smoothness and boundedness, more precise statements are possible [Gob22, Owe08]. The CLT-based error estimate (8) only holds asymptotically as . It can still be used in practice to obtain a confidence interval of the error (9); however, keeping fixed and letting is sometimes problematic, as [Tuf04, Lec10] noted. Specifically, the convergence of the distribution of the estimator (5) to a normal distribution cannot be guaranteed. Chebyshev’s inequality can also justify the convergence rate of . We employ the CLT for the error analysis and demonstrate that the derived error bounds hold at the specified confidence level for a simple example. However, we remark that it is straightforward to adapt the analysis to the Chebyshev bounds instead.
Remark 1** (Integration over general domains).**
The (RQ)-MC method is commonly defined for integration over the unit cube and uniform random variables. For integrals over general domains (e.g., normal random variables), the corresponding inverse cdf can be applied to maintain the general shape of the estimators (2) and (5).
3. Nested integration
After discussing the basics of the RQMC estimator, we address the focus of this work. This section establishes the DLQMC estimator for nested integration problems, derives asymptotic error bounds in the number of samples, and analyzes the optimal work required for this estimator to meet a tolerance goal.
Definition 1** (Nested integral).**
We define a nested integral as
[TABLE]
where the square-integrable function is nonlinear and twice differentiable with respect to . In addition, is square-integrable and defines a nonlinear relation between and , where are positive integers.
Example 1** (Nested integral).**
The integral
[TABLE]
where is a nonlinear function, is of the nested type and is typically not solvable in closed form, motivating the use of numerical integration techniques to approximate .
A standard method to approximate a nested integral (12) is via the DLMC estimator [Rya03], defined as
[TABLE]
where the points , , are sampled iid from , and , , , are sampled iid from . The standard MC estimator (2) for a single integral is unbiased and has a variance that decreases with the number of samples, which holds for the inner MC estimator in (14), where the variance decreases with the number of inner samples . The outer MC estimator in (14) has a variance that decreases with but also has a bias relative to the size of the variance of the inner integral estimate. Thus, we typically require many inner and outer samples to keep the bias and variance of this estimator in check, significantly limiting its practical usefulness, particularly for computationally demanding problems.
Directly evaluating the function is often not possible. For example, if evaluating requires solving a PDE, we may only have access to a finite element method (FEM) approximation with discretization parameter . As asymptotically, the convergence order of is given by
[TABLE]
where is the -convergence rate and is a constant. The work of evaluating is assumed to be , for some .
Unless stated otherwise, we assume that a discretized is used in the numerical estimators and omit the subscript for concision. The DLMC estimator has a bias with the following upper bound:
[TABLE]
where is a constant related to the variance of the inner MC estimation in (14), and might be different from the one introduced in (15). The DLMC estimator has a variance with the following upper bound:
[TABLE]
where are constants [Bec18]. The optimal work of the DLMC estimator for a specified error tolerance is given by
[TABLE]
Proofs for the specific case of approximating the expected information gain (EIG) in Bayesian OED are presented in [Bec18], but adapting them to the general case is straightforward.
To obtain smaller error bounds and, subsequently, smaller optimal work, we replaced both MC approximations in (14) with RQMC approximations and arrived at the DLQMC estimator, which we define below.
Definition 2** (DLQMC estimator).**
The DLQMC estimator of a nested integral (12) is defined as follows:
[TABLE]
where the square-integrable function is nonlinear, and is square-integrable. The sample points have the following shape (see (6)):
[TABLE]
where and .
One instance of the estimator (19) requires iid randomizations. The total number of function evaluations required to obtain a probabilistic error estimate using iid randomizations , , is . The difference between the estimators (14) and (19) lies in the points used to evaluate the function to be integrated.
Next, we analyze the error of DLQMC. First, we split the error into the bias and statistical errors, respectively, and estimate each individually. Specifically, these terms are
[TABLE]
The CLT allows us to replace the statistical error with the variance .
Proposition 1** (Bias of the DLQMC estimator).**
The DLQMC estimator for (12) has a bias with the following upper bound:
[TABLE]
where , and is the weak rate defined in (16), which is induced by the approximate function . The parameter depends on the dimension and may depend on the smoothness of .
Proof.
We start by introducing , the DLQMC estimator for evaluated exactly, and further split the bias into
[TABLE]
For the discretization bias, we have
[TABLE]
where is the weak rate defined in (16).
For the bias from the inner sampling, we first define the following:
[TABLE]
Next, we use the second-order Taylor expansion of for a random variable around ,
[TABLE]
to Taylor expand around (with a slight abuse of notation)
[TABLE]
Taking the expectation conditioned on , we obtain
[TABLE]
where the higher-order term can be derived using the Bienaymé formula, resulting in
[TABLE]
∎
The parameter can be estimated numerically along with for practical applications. In addition, the constant in (22) might be different from that in (16).
Proposition 2** (Variance of the DLQMC estimator).**
The DLQMC estimator for (12) has a variance with the following upper bound:
[TABLE]
where are constants and depends on the dimension and may depend on the smoothness of , depends on the dimension and may depend on the smoothness of , and depends on the dimension and may depend on the smoothness of the higher-order terms.
Proof.
By the law of total variance, we have
[TABLE]
Using (28) for the first term yields
[TABLE]
Only using (3) up to the first order for the second term in (31) results in
[TABLE]
∎
We expect , , and to be different from each other in general, as the approximated integrands might have different smoothness properties and dimensions.
With the error bounds established in Propositions (1) and (2), we can analyze the work required for the DLQMC estimator in terms of the number of samples and approximated model. We assume that the work for each model evaluation is .
Proposition 3** (Optimal work of the DLQMC estimator).**
The total work of the optimized DLQMC estimator for a specified error tolerance is given by
[TABLE]
as , where depends on the dimension and may depend on the smoothness of , and depends on the dimension and may depend on the smoothness of .
Proof.
The computational work of the DLQMC estimator is
[TABLE]
where is proportional to the work required for evaluating for discretization parameter . The CLT allows us to approximately bound the statistical error (30) above in probability. We obtain the optimal setting by solving
[TABLE]
for , the inverse cdf of the standard normal at confidence level , and (the allotted tolerance). In addition, is an error-splitting parameter.
We can solve this problem using Lagrange multipliers to derive the optimal and in terms of and . The equation for is cubic and has a closed-form solution, but it is unwieldy to state explicitly here. The last remaining equation for unfortunately has no closed-form solution for , so we must solve a simplified version and demonstrate that the resulting solution converges to the true solution as . The optimal values are given by
[TABLE]
[TABLE]
The optimal is given by the real root of
[TABLE]
Finally, the optimal is given by the solution to
[TABLE]
It immediately follows from (36) that such an exists. However, it is only given in closed form for the cases .
Assuming that the optimal discretization parameter is independent of the sampling method (MC or RQMC), we split the bias constraint (22) as follows:
[TABLE]
to derive asymptotic rates in terms of the tolerance . Splitting more elaborately might improve constant terms, but this is only performed for analytical purposes. Together with (38) and (37), this implies that
[TABLE]
and
[TABLE]
Next, we demonstrate that . From the variance constraint (30), we obtain
[TABLE]
The term on the right-hand side of (45) is constant in . If we ignore the second term on the left-hand side, we can solve the equation (45) and obtain the approximate solution:
[TABLE]
To determine that this approximation converges to the true solution, we check that the ignored term in (45) approaches 0 as as we insert (46). For this term, we have
[TABLE]
where the exponent of is between 0 and 1 because . Thus, this term approaches 0 as . In contrast, if we ignored the first term in (45), the approximate solution would be . Inserting this solution, the first term in (45) has an exponent of between negative infinity and 0; thus, it approaches negative infinity as . ∎
The constants , , and can be estimated using random shifts. Rather than using the approximate solution (46), we can also solve the equation (40) numerically.
Remark 2** (Borderline settings).**
In the worst case, , we obtain the optimal MC work , as presented in (18). In the best case, , we obtain the optimal DLQMC work , a reduction of order 3/2 compared to the DLMC method.
