Quasi-Monte Carlo integration for twice differentiable functions over a triangle
Takashi Goda, Kosuke Suzuki, Takehito Yoshiki

TL;DR
This paper develops a quasi-Monte Carlo method for integrating twice differentiable functions over a triangle, achieving near-optimal error bounds with explicit point sequence constructions.
Contribution
It provides an explicit construction of point sequences for QMC integration over a triangle with near-optimal error bounds, extending previous methods.
Findings
Achieves integration error of order N^{-1} (log N)^3
Includes a construction by Basu and Owen (2015) as a special case
Proves the bounds are nearly optimal given known lower bounds
Abstract
We study quasi-Monte Carlo integration for twice differentiable functions defined over a triangle. We provide an explicit construction of infinite sequences of points including one by Basu and Owen (2015) as a special case, which achieves the integration error of order for any . Since a lower bound of order on the integration error holds for any linear quadrature rule, the upper bound we obtain is best possible apart from the factor. The major ingredient in our proof of the upper bound is the dyadic Walsh analysis of twice differentiable functions over a triangle under a suitable recursive partitioning.
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Quasi-Monte Carlo integration for twice differentiable functions over a triangle††thanks:
The research of T. Goda was supported by JSPS Grant-in-Aid for Young Scientists No.15K20964. The research of K. Suzuki and T. Yoshiki was partially supported under the Australian Research Councils Discovery Projects funding scheme (project number DP150101770). The research of K. Suzuki was partially supported under CREST, JST.
Takashi Goda, Kosuke Suzuki, Takehito Yoshiki Graduate School of Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan ([email protected])Graduate School of Science, Hiroshima University, 1-3-1 Kagamiyama, Higashihiroshima 739-8526, Japan ([email protected])Research Support Department, University Management Division, Osaka City University, 3-3-138 Sugimoto, Sumiyoshi-ku, Osaka-shi, 558-8585 Japan. ([email protected])
Abstract
We study quasi-Monte Carlo integration for twice differentiable functions defined over a triangle. We provide an explicit construction of infinite sequences of points including one by Basu and Owen (2015) as a special case, which achieves the integration error of order for any . Since a lower bound of order on the integration error holds for any linear quadrature rule, the upper bound we obtain is best possible apart from the factor. The major ingredient in our proof of the upper bound is the dyadic Walsh analysis of twice differentiable functions over a triangle under a suitable recursive partitioning.
Keywords: Quasi-Monte Carlo, digital nets and sequences, numerical integration on triangle, dyadic Walsh analysis
MSC classifications: Primary, 42C10, 65D32; Secondary, 41A55, 65C05, 65D30
1 Introduction
In this paper we study numerical integration of twice differentiable functions defined over a triangle . For an integrable function , we denote the true normalized integral of by
[TABLE]
where denotes the Lebesgue measure of . As an approximation of , we consider a linear algorithm of the form
[TABLE]
for an -element point set and a set of real-valued weights . In particular, a quasi-Monte Carlo (QMC) integration is an equal-weight quadrature rule where the weights sum up to 1, i.e., a linear algorithm with the special choice for all . Therefore, is simply approximated by
[TABLE]
If an infinite sequence of points is given, the first elements of are used as .
We define the norm in by
[TABLE]
and study the worst-case absolute error over the unit ball of , i.e.,
[TABLE]
Thus an obvious goal in this context is to construct a good point set or sequence in such that the quantity is small either for some or uniformly for all .
The theory of QMC integration has been developed in depth with the particular focus on approximating the integral of functions defined over the unit cube , see for instance [6, 9, 12]. In fact, not much attention has been paid to QMC integration over non-cubical domains until recently. We have to point out, however, that many practical problems are not necessarily given by quadrature over the unit cube. So far, the most standard approach to QMC integration over a non-cubical domain is to find a uniformity-preserving transformation and then to approximate the normalized integral of by
[TABLE]
for . In the literature, Fang and Wang [7] introduced several transformations from the unit cube to the ball, sphere, and simplex. Pillards and Cools [10] studied 5 different transformations from the unit cube to the simplex. More recently, Basu and Owen [3] gave sufficient conditions on so that is either of bounded variation or satisfies additional smoothness conditions.
Instead of applying a uniformity-preserving transformation, more direct and explicit constructions of point sets and sequences in a triangular domain have been introduced recently by Basu and Owen [2]. One is based on the van der Corput sequence in base 4 in conjunction with a recursive partitioning of . The other is given by a rotation of an integer lattice through an angle whose tangent is badly approximable. A discrepancy measure derived in [4] was employed as a quality criterion of these constructions, and it was shown that the latter one attains a lower discrepancy. Nonetheless, the former one is of practical importance since it is extensible and can be randomized.
In this paper, motivated by the first construction of Basu and Owen, we study QMC integration for smooth functions in . In particular, we give an explicit construction of infinite sequences of points including one by Basu and Owen as a special case, and prove that our quadrature rule achieves the worst-case error of order for any in . The main result of this paper can be summarized as follows:
Theorem 1**.**
For a triangle , we can explicitly construct an infinite sequence of points in for which there exists a constant such that
[TABLE]
for all , and in particular,
[TABLE]
for all .
Roughly speaking, our approach for the proof of Theorem 1 is to exploit the decay of the Walsh coefficients for under a suitable recursive partitioning of . By following the essentially same argument using bump functions as in [1], see also [5, Section 2.7], we see that a lower bound of order on the worst-case error holds for any linear algorithm in . Namely, there exists a constant such that
[TABLE]
holds for any choice of and . Thus the upper bound we obtain is best possible apart from the factor.
The rest of this paper is organized as follows. In the next section we present an explicit construction of infinite sequences of points in . We prove an upper bound on the worst-case error for our quadrature rule in Section 3, whereas the result on the decay of the Walsh coefficients for , which is necessary for the proof of an upper bound, is shown later in Section 4.
Throughout this paper we use the following notations. Let be the set of integers, the set of positive integers, and . We denote the two-element field by , which is identified with the set equipped with addition and multiplication modulo 2. The addition operation in is denoted by , and in case of vectors or matrices over , is applied componentwise. Further we denote a triangular domain with vertices by
[TABLE]
and the diameter of a set by . Without loss of generality, we assume that the center of a triangle is located at the origin in , i.e., if the center of is not located at the origin, it suffices to shift the whole domain .
2 Explicit construction
2.1 Recursive partitioning
In a similar way to [2, Section 3], here we introduce a recursive partitioning of a triangle . We first partition the triangle into 4 congruent subtriangles, to each of which a pair is assigned with in the center. Then we partition each subtriangle into 4 congruent sub-subtriangles, to each of which a pair is assigned again with in the center. Hence every sub-subtriangle can be now identified with a set of pairs and . This is illustrated in Figure 1. It is obvious that this recursive partitioning of the triangle defines the mapping from to , which is surjective but not injective. Moreover, for a matrix with for all , the first rows of determines which of congruent subregions the matrix is mapped within, and from the condition that the pair is always assigned in the center, we see that the matrix is mapped to the center of the corresponding subregion.
We now describe our recursive partitioning more precisely. The subtriangle of for a pair is defined by
[TABLE]
Then the sub-subtriangle for a set of pairs and is defined by
[TABLE]
In this way, the subregion for a matrix with some is defined recursively by
[TABLE]
For simplicity of notation, as long as , we write
[TABLE]
Moreover we define the mapping by
[TABLE]
and, again for simplicity of notation, as long as we write
[TABLE]
Regarding the map we have the following lemma. Since the result can be easily proved by induction on , we omit the proof.
Lemma 1**.**
For a matrix , we define by and
[TABLE]
for . Let with and . Then the following holds true:
For , we have
[TABLE] 2. 2.
For , define by
[TABLE]
Then we have . In particular
[TABLE]
2.2 Generating infinite sequences of points in a triangle
We describe how to generate an infinite sequence of points in a triangle . For this purpose, we first introduce the definition of digital nets over for the two-dimensional case.
Definition 1**.**
For with , let . For an integer , denote the dyadic expansion of by . Define the matrix by
[TABLE]
for . Then we call the subset a (two-dimensional) digital net over with generating matrices .
Remark 1**.**
By using the map defined by
[TABLE]
for , digital nets over are usually defined as point sets in by , where is applied columnwise. Here we see that the integer denotes the precision of points. In this paper, it is more reasonable to define digital nets over as subsets in instead of point sets in .
The above definition can be extended to digital sequences over .
Definition 2**.**
Let . For each , we assume for all sufficiently large . For , denote the dyadic expansion of by . Define the matrix by
[TABLE]
for . Then we call the infinite sequence a (two-dimensional) digital sequence over with generating matrices .
Remark 2**.**
It follows from the assumption for all sufficiently large that, for each , there exists a unique such that for both . Furthermore, for , the first elements of can be regarded as a digital net over with generating matrices for some , where we denote by the left upper sub-matrix of .
Now we are ready to present how to generate an infinite sequence of points in a triangle .
Definition 3**.**
Let be a digital sequence over with generating matrices . Then an infinite sequence of points in is given by
[TABLE]
where the function is given as in Remark 2.
It is clear from this definition that our infinite sequence of points is determined by generating matrices . Thus we need some quality measure for generating matrices to make an explicit construction of possible, which is discussed in the next subsection.
2.3 Dual net and a new weight function
We first recall the notion of dual net.
Definition 4**.**
For with , let a digital net over with generating matrices . The dual net of is defined by
[TABLE]
Remark 3**.**
Let be a digital sequence over . As mentioned in Remark 2, the first elements of are a digital net over with generating matrices for some . Thus the above definition of the dual net still applies to such an initial finite segment of .
The following weight function, introduced in [8] and [11], is well known.
Definition 5**.**
For , where all but only a finite number of are 0, we define
[TABLE]
and . In case of a matrix with , where all but only a finite number of elements in are 0, we define
[TABLE]
If is an element in for some , by considering an injection
[TABLE]
we use the same symbol to define the weight function for such . A similar abuse of notation is also done in case of a matrix for finite .
For a digital net , we define the so-called minimum weight by
[TABLE]
which has been often used as a quality measure of generating matrices for QMC integration over the unit cube. If a digital net satisfies
[TABLE]
for some , we call a digital -nets over . Furthermore, for a digital sequence , if there exists a non-negative integer such that the first elements of are a digital -net over for any , we call a digital -sequence over .
Remark 4**.**
Any digital net satisfies the above inequality for . In practice, we prefer a larger value of and thus equivalently a smaller value of , and is best possible. We refer to **[6, 9]** for several explicit constructions of digital nets and sequences with small -value. 2. 2.
In what follows, we restrict ourselves to digital -sequences with upper triangular generating matrices which satisfy for . Explicit constructions of digital sequences by Sobol **[13]** and Tezuka **[14]** hold this property. Besides, by allowing the situation , the first elements of a digital -sequence can be regarded as a digital -net for any .
Now we introduce a new weight function which suits our purpose.
Definition 6**.**
Let . For a matrix with , where all but only a finite number of elements in are 0 if , we define
[TABLE]
We can define the weight function equivalently as follows: For a matrix , where all but only a finite number of are 0 if , define
[TABLE]
and .
Similarly to , we define the minimum weight of a digital net by
[TABLE]
Here we prefer a digital net with a large value of . In the following lemma, we show that a digital -net with small is exactly what we want.
Lemma 2**.**
Let be a digital -net over . Then we have
[TABLE]
Moreover let be a digital -sequence over . Then for any , we have
[TABLE]
Proof.
Let with , where all but only a finite number of elements in are 0. From the definitions of and , we have
[TABLE]
which gives
[TABLE]
This proves the first statement. The second statement directly follows from the definition of a digital -sequence. ∎
Hence our explicit construction of an infinite sequence of points in is to use a digital -sequence over with upper triangular generating matrices which is mapped to according to Definition 3. In the next section, we prove that such an infinite sequence of points in achieves the almost optimal order of convergence for smooth functions in .
Before going into the proof of an error bound, we provide another explicit construction inspired by the first construction due to Basu and Owen [2]. Let be given by
[TABLE]
For with finite dyadic expansion , we have
[TABLE]
Thus for even , it is obvious that the first elements of generated by these matrices are given by
[TABLE]
Considering the image of the map , we can easily check that the point set in obtained in this way is the same as that of Basu and Owen. This implies that our construction scheme includes their explicit construction as a special case.
Moreover it is easy to show that the first elements of are actually a digital -net over . It can be seen from Lemma 2 that the minimum weight for is bounded below by , which can be improved as follows. Since the result follows from direct calculation, we omit the proof.
Lemma 3**.**
Let be given by (2). For , let be a digital net over with generating matrices . Then we have
[TABLE]
3 Upper bound
Here we prove an upper bound on the worst-case error for our quadrature rule in by using the result later shown in Section 4.
3.1 Discretized function on a triangle
Definition 7**.**
For an integrable function and , we define the -th discretized function by
[TABLE]
for .
Obviously we have
[TABLE]
Moreover it can be shown that approximates well.
Lemma 4**.**
Let and . For any , we have
[TABLE]
Proof.
From Definition 7 we have
[TABLE]
Let us fix and consider the line segment . Since is a triangle, and thus is convex, the set is included in . Hence we have
[TABLE]
which completes the proof. ∎
3.2 Walsh functions and coefficients
In order to exploit the smoothness of functions in , we shall conduct a discrete Walsh-Fourier analysis of the discretized function defined on later in Section 4. Right now we just introduce the definition of Walsh functions and briefly review some basic facts so as to make the proof of the main result in the next subsection accessible.
First the Walsh functions are defined as follows.
Definition 8**.**
Let be fixed. For a matrix , the -th Walsh function is defined by
[TABLE]
for .
For a function , we have the following Walsh expansion:
[TABLE]
where denotes the -th Walsh coefficient defined by
[TABLE]
The following character property holds between a digital net over and Walsh functions, see for instance [5, Lemmas 4.2 & 4.5] for the proof.
Lemma 5**.**
Let be a digital net over . Then we have
[TABLE]
Using this lemma, for any we have
[TABLE]
where the second equality stems from the fact
[TABLE]
for any .
3.3 Proof of the main result
In order to prove Theorem 1, it suffices to show:
Theorem 2**.**
Let be either a digital -sequence with upper triangular generating matrices or a digital sequence with generating matrices given by (2), and let be constructed according to Definition 3. Denote the first elements of by . For any , the following holds true:
For all , we have
[TABLE] 2. 2.
For all , we have
[TABLE]
Proof.
We only prove the case where is a digital -sequence with upper triangular generating matrices. The case where is a digital sequence with generating matrices given by (2) can be shown in exactly the same way.
We denote the dyadic expansion of by , where . We split the first elements of , denoted by , into non-overlapping subsets
[TABLE]
It is the well-known property of a digital sequence that each subset is given by digitally shifting a digital net , see for instance [6, Proof of Theorem 4.84]. That is, there exists such that
[TABLE]
Let . Due to the property of upper triangular matrices, all of the elements in and can have at most the first rows different from . Thus we obtain
[TABLE]
For each , we write
[TABLE]
which is a digital -net with generating matrices , see the second item of Remark 4. By using (3), (4), and Lemma 4 we have
[TABLE]
Let . Applying the result obtained in Lemma 10, we have
[TABLE]
The sum on the right-hand side is bounded by
[TABLE]
where we write , which is a linear subspace of . The following obvious inclusions
[TABLE]
induces the injective map
[TABLE]
Therefore we have
[TABLE]
It follows from the fact for that
[TABLE]
and thus for . Now we obtain
[TABLE]
Using this bound and Lemma 2, the summand in (5) can be bounded by
[TABLE]
Plugging this bound into (5), we have
[TABLE]
Hence the result for the first item follows. The second item follows easily by considering the case , for which we have and . ∎
4 Walsh analysis on a triangle
In this section, we give a bound on the Walsh coefficient for the -th discretized function for . We first present a formula between the Walsh coefficient and the so-called dyadic differences in Lemma 8. Here the dyadic differences are defined in Definition 10. Note that the concept of the dyadic differences is originally introduced in [15], while we need to change the definition slightly so as to suit our purpose, i.e, QMC integration over a triangular domain. Converting the dyadic differences into the usual derivatives, we have a bound on the Walsh coefficient for in Lemma 10.
4.1 Definitions and basic results
Here we introduce some more definitions and show some basic but necessary results related to them. For , and with , we define the operation
[TABLE]
Moreover, for , we define
[TABLE]
Regarding the group operation , we have the following.
Lemma 6**.**
Let , , and .
For ,
[TABLE] 2. 2.
For ,
[TABLE]
Proof.
Let us consider the first item. It follows from Lemma 1 that
[TABLE]
where we write . For with , we have , which implies . Thus, by the definition of , we have
[TABLE]
and for . Using these facts, we have
[TABLE]
Hence we have the result.
Let us move on to the second item. For with , we have , which implies . Thus, by the definition of , we have (7) and
[TABLE]
Using these equalities we have
[TABLE]
Hence we have the result. ∎
Let , and . By abuse of notation, we define the map also for a real vector by
[TABLE]
As long as there is no risk of confusion, we simply denote it as . By comparing this definition with the results of Lemma 6, it is straightforward to see that the image of the restriction of the map to is . By the definition of , this map is isometric, and thus, is a function. This map has the following relationship with the group operator .
Lemma 7**.**
For any , the following holds true:
For , , and , we have
[TABLE] 2. 2.
For , , and , we have
[TABLE] 3. 3.
For and , we have
[TABLE]
and
[TABLE]
Proof.
Let us consider the first item. Since is isometric, using the change of variables , we have . Thus the result follows.
The second item follows from applying the first item twice.
Finally let us consider the third item. Since , we also have . As above, we have for . Since the subregion with does not depend on and is identical to , we have
[TABLE]
Since we now know that for , it follows that
[TABLE]
which completes the proof. ∎
Furthermore, we need the following maps all from to :
[TABLE]
For , let and . It is then trivial to see and for any .
Now fix with . We divide the set into some mutually exclusive subsets:
[TABLE]
for . The following properties obviously hold:
[TABLE]
4.2 Bounds on Walsh coefficients
Using the division of by introduced in the previous section, we consider separating the -th Walsh coefficient of into the following values .
Definition 9**.**
Let . For and , we define the Walsh coefficient of on the subset :
[TABLE]
Note that it is obvious to see
[TABLE]
Thus, in order to obtain an upper bound on , it suffices to show an upper bound on each . For this goal, we first introduce the concept of dyadic differences.
Definition 10**.**
For a function , the -th dyadic difference for is defined by
[TABLE]
for .
We now show the following key equalities on and dyadic differences.
Lemma 8**.**
Let be a function. For , the following holds true:
For , we have
[TABLE] 2. 2.
For , we have
[TABLE] 3. 3.
For , we have
[TABLE]
Proof.
Let us consider the first and second items. Denote
[TABLE]
We first show that we have
[TABLE]
For with , the -th components of and are same, and the -th component of is for . Thus we only need to show that belongs to the -th component of . For , it obviously holds since the -th component of is . For , the -th component of is , and thus from the property
[TABLE]
we see that belongs to the -th component of .
Using the equality (9) and from the property of Walsh functions, we have
[TABLE]
Then it follows that
[TABLE]
which completes the proof of the second item by putting . Let us consider the case . From the definition of , we have and thus . Hence we have the result for the first item.
From the result for the first item, to which the result for the second item is applied with replaced by , we have
[TABLE]
Hence the result for the third item follows. ∎
Converting the dyadic differences to the usual derivatives, we shall get a bound on where denotes the -th discretized function of . As a preparation we need the following lemma.
Lemma 9**.**
Let with . For , we have
[TABLE]
and
[TABLE]
Proof.
Since is convex, we have . Following a similar argument as in the proof of Lemma 4, we can get the first inequality of this lemma. Thus let us focus on the second one. Again in a similar way as in the proof of Lemma 4, we have for ,
[TABLE]
The summand in the last expression for a given is bounded by
[TABLE]
from which the second inequality of this lemma obviously follows. ∎
Eventually we arrive at showing upper bounds on and .
Lemma 10**.**
Let be a function and be its -th discretized function. For any , we have
[TABLE]
Proof.
First we recall that holds since . Thus for any we have
[TABLE]
We use this equality without any notice.
We now show a bound on . From the first item of Lemma 8 and the triangle inequality, we have
[TABLE]
From the obvious fact and the first item of Lemma 7, we have
[TABLE]
where we use the result in Lemma 9 with in the second inequality, and then the third item of Lemma 7 in the last inequality. Thus we obtain a bound on :
[TABLE]
Next we show a bound on for . From the third item of Lemma 8 and the triangle inequality, we have
[TABLE]
From the second item of Lemma 7, we have
[TABLE]
In what follows, we continue with further arguments separately for the cases and .
Let us consider the case . In this case we have . Thus we obtain
[TABLE]
It is easy to see by definition that for . Since implies
[TABLE]
we have . Further it follows from the third item of Lemma 7 that . Thus we obtain
[TABLE]
and
[TABLE]
Comparing these equalities gives
[TABLE]
from which we see that form a parallelogram. By using the result in Lemma 9 with and and then the third item of Lemma 7 again, we have
[TABLE]
Thus we obtain a bound on :
[TABLE]
Let us move onto the case . In this case we have . Thus we obtain
[TABLE]
Since , we see that
[TABLE]
Thus, from (8) we obtain
[TABLE]
and
[TABLE]
Comparing these equalities gives
[TABLE]
from which we see that form a parallelogram. (Note that the points and form not the diagonal but the edge of the parallelogram unlike in the case of .) By using the result in Lemma 9 with and and then the third item of Lemma 7 again together with the triangle inequality, we have
[TABLE]
Thus we obtain a bound on :
[TABLE]
Finally a bound on is given by
[TABLE]
Hence we complete the proof of this lemma. ∎
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] N. S. Bakhvalov, Approximate computation of multiple integrals (in Russian), Vestnik Moskov. Univ. Ser. Mat. Meh. Astr. Fiz. Him. 4 (1959), 3–18.
- 2[2] K. Basu, A. B. Owen, Low discrepancy constructions in the triangle, SIAM J. Numer. Anal. 53 (2015), 743–761.
- 3[3] K. Basu, A. B. Owen, Transformations and Hardy-Krause variation, SIAM J. Numer. Anal. 54 (2016), 1946–1966.
- 4[4] L. Brandolini, L. Colzani, G. Gigante, G. Travaglini, A Koksma–Hlawka inequality for simplices, in: Trends in Harmonic Analysis, Springer, New York, 2013, pp. 33–46.
- 5[5] J. Dick, A. Hinrichs and F. Pillichshammer, Proof techniques in quasi-Monte Carlo theory, J. Complexity 31 (2015), 327–371.
- 6[6] J. Dick, F. Pillichshammer, Digital Nets and Sequences: Discrepancy Theory and Quasi-Monte Carlo Integration , Cambridge University Press, Cambridge, 2010.
- 7[7] K.-T. Fang, Y. Wang, Number-Theoretic Methods in Statistics , Chapman & Hall, London, 1994.
- 8[8] H. Niederreiter, Low-discrepancy point sets, Monatsh. Math., 102 (1986), 155–167.
