An Upper Bound of the Minimal Dispersion via Delta Covers
Daniel Rudolf

TL;DR
This paper establishes an upper bound on the largest empty test set volume for point sets in high-dimensional cubes, using delta covers, with specific bounds for axis-parallel boxes and toroidal cases.
Contribution
It introduces a new upper bound on minimal dispersion based on delta covers, applicable to various geometric test sets in high dimensions.
Findings
Bound of (log |Γ_δ|)/n + δ for minimal dispersion
Specific bounds for axis-parallel boxes: (4d/n) log(9n/d)
Specific bounds for torus: (4d/n) log(2n)
Abstract
For a point set of $n$ elements in the $d$-dimensional unit cube and a class of test sets we are interested in the largest volume of a test set which does not contain any point. For all natural numbers $n$, $d$ and under the assumption of a $\delta$-cover with cardinality $|\Gamma_\delta|$ we prove that there is a point set such that the largest volume of such a test set without any point is bounded by $\frac{\log|\Gamma_\delta|}{n} + \delta$. For axis-parallel boxes on the unit cube this leads to a volume of at most $\frac{4d}{n}\log\left(\frac{9n}{d}\right)$ and on the torus to $\frac{4d}{n}\log(2n)$.
Daniel Rudolf
Institut für Mathematische Stochastik, University of Goettingen, Goldschmidtstraße 7, 37077 Göttingen, Germany
Email: [email protected]
Dedicated to Ian H. Sloan on the occasion of his 80th birthday.
1 Introduction and Main Results
For a point set $P$ of $n$ elements in the unit cube $[0,1]^d$ and for a set $\mathcal{B}$ of measurable subsets of $[0,1]^d$ the quantity of interest is the dispersion, given by
$$\operatorname{disp}(P,\mathcal{B}) := \sup_{B \in \mathcal{B},\; P \cap B = \emptyset} \lambda_d(B). \tag{1}$$
Here $\lambda_d$ denotes the $d$-dimensional Lebesgue measure and $\mathcal{B}$ is called the set of test sets. The dispersion measures the size of the largest hole which does not contain any point of $P$. The shape of the hole is specified by the set of test sets. We are interested in point sets with best possible upper bounds on the dispersion, which thus allow only small holes without any point. Of course, any estimate of $\operatorname{disp}(P,\mathcal{B})$ depends on $n$, $d$ and $\mathcal{B}$.
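To make the definition concrete, the following minimal sketch computes the dispersion of a small point set in the unit square with respect to axis-parallel boxes by brute force. The function name `dispersion_boxes_2d` is hypothetical and the restriction to $d = 2$ is an assumption made only for illustration. It uses the observation that the supremum is attained by a box whose sides lie on coordinates of the points or on the boundary of the square.

```python
from itertools import combinations


def dispersion_boxes_2d(points):
    """Largest area of an axis-parallel box in [0,1]^2 containing no point.

    Brute force over all boxes whose sides lie on point coordinates or on
    the boundary of the square; cost grows like n^5, so small inputs only.
    """
    xs = sorted({0.0, 1.0, *(p[0] for p in points)})
    ys = sorted({0.0, 1.0, *(p[1] for p in points)})
    best = 0.0
    for x1, x2 in combinations(xs, 2):
        for y1, y2 in combinations(ys, 2):
            # Points on the boundary do not count: the box can be shrunk
            # slightly without changing the supremum of the volume.
            empty = not any(x1 < px < x2 and y1 < py < y2 for px, py in points)
            if empty:
                best = max(best, (x2 - x1) * (y2 - y1))
    return best


if __name__ == "__main__":
    # A single point in the centre leaves an empty half of the square.
    print(dispersion_boxes_2d([(0.5, 0.5)]))
```

Such an exhaustive search is feasible only for very small $n$ and $d$; the algorithmic literature cited below treats the largest empty box problem far more efficiently.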
Classically, the dispersion of a point set $P$ was introduced by Hlawka Hl76 as the radius of the largest ball, with respect to some metric, which does not contain any point of $P$. This quantity appears in the setting of quasi-Monte Carlo methods for optimization, see Ni83 and (Ni92, Chapter 6). The notion of the dispersion from (1) was introduced by Rote and Tichy in RoTi96 to allow more general test sets. There the focus is on the dependence of $\operatorname{disp}(P,\mathcal{B})$ on $n$ (the cardinality of the point set). In contrast to that, we are also interested in the behavior with respect to the dimension.
There is a well-known relation to the star-discrepancy, namely, the dispersion is a lower bound of this quantity. For further literature, open problems, recent developments and applications related to this topic we refer to DiPi10; DiRuZh13; Ni92; No15; NoWo10.
For the test sets we focus on axis-parallel boxes. Point sets with small dispersion with respect to such axis-parallel boxes are useful for the approximation of rank-one tensors, see BaDaDeGr14; NoRu16. In computational geometry, given a point configuration, the problem of finding the largest empty axis-parallel box is well studied. Starting with NaLeHs84 for $d = 2$, there is a considerable amount of work for $d > 2$, see DuJi13; DuJi16 and the references therein. Given a large dataset of points, the search for empty axis-parallel boxes is motivated by the fact that such boxes may reveal natural constraints in the data and thus unknown correlations, see EdGrLiMi03.
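The minimal dispersion, given by
$$\operatorname{disp}_{\mathcal{B}}(n,d) := \inf_{P \subset [0,1]^d,\; |P| = n} \operatorname{disp}(P,\mathcal{B}),$$
quantifies the best possible behavior of the dispersion with respect to $n$, $d$ and $\mathcal{B}$. Another significant quantity is the inverse of the minimal dispersion, that is, the minimal number of points with minimal dispersion at most $\varepsilon \in (0,1)$, i.e.,
$$N_{\mathcal{B}}(\varepsilon,d) := \min\{ n \in \mathbb{N} : \operatorname{disp}_{\mathcal{B}}(n,d) \le \varepsilon \}.$$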
By virtue of a result of Blumer, Ehrenfeucht, Haussler and Warmuth (BlEhHaWa89, Lemma A2.1, Lemma A2.2 and Lemma A2.4) one obtains
[TABLE]
or stated differently
[TABLE]
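where $\log_2$ is the dyadic logarithm and $d_{\mathcal{B}}$ denotes the VC-dimension of $\mathcal{B}$. (The VC-dimension is the cardinality of the largest subset $T$ of $[0,1]^d$ such that the set system $\{T \cap B : B \in \mathcal{B}\}$ contains all subsets of $T$.) The dependence on $d$ is hidden in the VC-dimension $d_{\mathcal{B}}$. For example, for the set of test sets of axis-parallel boxes
$$\mathcal{B}_{\mathrm{ex}} := \Big\{ \prod_{k=1}^{d} [x_k, y_k) \subseteq [0,1]^d : x_k < y_k,\; k = 1,\dots,d \Big\},$$
it is well known that $d_{\mathcal{B}_{\mathrm{ex}}} = 2d$. However, the concept of VC-dimension is not as easy to grasp as it might seem at first glance, and it is also not trivial to prove upper bounds on $d_{\mathcal{B}}$ depending on $\mathcal{B}$. For instance, for periodic axis-parallel boxes, which coincide with the interpretation of considering the torus instead of the unit cube, given by
$$\mathcal{B}_{\mathrm{per}} := \Big\{ \prod_{k=1}^{d} I_k(x,y) : x = (x_1,\dots,x_d),\; y = (y_1,\dots,y_d) \in [0,1]^d \Big\}$$
with
$$I_k(x,y) := \begin{cases} (x_k, y_k) & \text{if } x_k < y_k, \\ [0,1] \setminus [y_k, x_k] & \text{if } y_k \le x_k, \end{cases}$$
the dependence on $d$ in $d_{\mathcal{B}_{\mathrm{per}}}$ is not obvious. The conjecture here is that $d_{\mathcal{B}_{\mathrm{per}}}$ behaves similarly to $d_{\mathcal{B}_{\mathrm{ex}}}$, i.e., linearly in $d$, but we do not have a proof of this fact.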
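As an illustration of periodic boxes, the following sketch evaluates membership and volume for a single periodic box. It assumes the usual wrap-around convention on the torus: in coordinate $k$ the box is the open interval $(x_k, y_k)$ if $x_k < y_k$, and the complement $[0,1] \setminus [y_k, x_k]$ otherwise. All function names are hypothetical.

```python
import math


def periodic_interval_length(xk, yk):
    """Length of (xk, yk) if xk < yk, else of [0,1] minus [yk, xk]."""
    return yk - xk if xk < yk else 1.0 - (xk - yk)


def in_periodic_interval(t, xk, yk):
    """Membership of t in the one-dimensional periodic interval."""
    if xk < yk:
        return xk < t < yk
    return not (yk <= t <= xk)


def periodic_box_volume(x, y):
    """Lebesgue measure of the periodic box with corner vectors x and y."""
    return math.prod(periodic_interval_length(a, b) for a, b in zip(x, y))


def in_periodic_box(p, x, y):
    """True if the point p lies in the periodic box defined by x and y."""
    return all(in_periodic_interval(t, a, b) for t, a, b in zip(p, x, y))


if __name__ == "__main__":
    x, y = (0.7, 0.2), (0.2, 0.8)  # first coordinate wraps around
    print(periodic_box_volume(x, y), in_periodic_box((0.9, 0.5), x, y))
```

Every pair $(x, y)$ defines a box here, including those with $y_k \le x_k$, which is what distinguishes the torus setting from the non-periodic one.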
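The aim of this paper is to prove an estimate similar to (2) based on the concept of a $\delta$-cover of $\mathcal{B}$. For a discussion about $\delta$-covers, bracketing numbers and VC-dimension we refer to Gn08. Let $\mathcal{B}$ be a set of measurable subsets of $[0,1]^d$. A $\delta$-cover for $\mathcal{B}$ with $\delta > 0$ is a finite set $\Gamma_\delta \subseteq \mathcal{B}$ which satisfies
$$\forall B \in \mathcal{B} \quad \exists L_B, U_B \in \Gamma_\delta \quad \text{with} \quad L_B \subseteq B \subseteq U_B$$
such that $\lambda_d(U_B \setminus L_B) \le \delta$. The main abstract theorem is as follows.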
Theorem 1.1
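For a set of test sets $\mathcal{B}$ assume that for $\delta > 0$ the set $\Gamma_\delta$ is a $\delta$-cover of $\mathcal{B}$. Then
$$\operatorname{disp}_{\mathcal{B}}(n,d) \le \frac{\log|\Gamma_\delta|}{n} + \delta. \tag{4}$$
The cardinality of the $\delta$-cover plays a crucial role in the upper bound of the minimal dispersion. Thus, to apply the theorem to concrete sets of test sets one has to construct suitable, not too large, $\delta$-covers.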
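The interplay between the cover and the bound can be tried out in the simplest setting. The sketch below is an illustration only, not the cover of Gn08 or of Lemma 2: for $d = 1$ and half-open intervals $[x, y)$ it uses the grid intervals $[i/m, j/m)$ together with the empty set (an assumption made for convenience, so that a lower bracket always exists), which brackets every interval up to $\delta = 2/m$. All names are hypothetical.

```python
import math
import random


def cover_size(m):
    """Number of grid intervals [i/m, j/m) with i < j, plus the empty set."""
    return m * (m + 1) // 2 + 1


def bracket(x, y, m):
    """Grid intervals (lower, upper) with lower inside [x, y) inside upper."""
    left, right = math.ceil(x * m), math.floor(y * m)
    # If no grid interval fits inside [x, y), the empty set is the lower bracket.
    lower = (left / m, right / m) if left < right else (0.0, 0.0)
    upper = (math.floor(x * m) / m, math.ceil(y * m) / m)
    return lower, upper


def theorem_bound(cardinality, n, delta):
    """The quantity log|Gamma_delta| / n + delta from Theorem 1.1."""
    return math.log(cardinality) / n + delta


def check_cover(m, trials=1000, seed=0, tol=1e-12):
    """Empirically check the bracketing property with delta = 2/m."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y = sorted((rng.random(), rng.random()))
        if x == y:
            continue
        (l0, l1), (u0, u1) = bracket(x, y, m)
        inside = l0 == l1 or (x <= l0 + tol and l1 <= y + tol)
        outside = u0 <= x + tol and y <= u1 + tol
        if not (inside and outside and (u1 - u0) - (l1 - l0) <= 2 / m + tol):
            return False
    return True


if __name__ == "__main__":
    m, n = 100, 1000
    print(check_cover(m), theorem_bound(cover_size(m), n, 2 / m))
```

A finer grid decreases $\delta$ but increases $\log|\Gamma_\delta|$, so for a given $n$ the two terms of the bound have to be balanced.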
For $\mathcal{B}_{\mathrm{ex}}$ the best results on $\delta$-covers we know are due to Gnewuch, see Gn08. As a consequence of the theorem and a combination of (Gn08, Formula (1), Theorem 1.15, Lemma 1.18) one obtains
Corollary 1
For and we have
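$$\operatorname{disp}_{\mathcal{B}_{\mathrm{ex}}}(n,d) \le \frac{4d}{n} \log\left(\frac{9n}{d}\right). \tag{5}$$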
(For the trivial estimate applies.) In particular,
[TABLE]
Obviously, this is essentially the same as the estimates (2) and (3) in the setting of $\mathcal{B}_{\mathrm{ex}}$. Let us discuss how those estimates fit into the literature. From (AiHiRu15, Theorem 1 and (4)) we know that
[TABLE]
where $p_i$ denotes the $i$th prime. The upper bound is due to Larcher based on suitable -nets and for improves the super-exponential estimate of Rote and Tichy (RoTi96, Proposition 3.1) based on the Halton sequence. The order of convergence with respect to $n$ is optimal, but the dependence on $d$ in the upper bound is exponential. In the estimate of Corollary 1 the optimal order in $n$ is not achieved, but the dependence on $d$ is much better. Already for it is required that must be larger than to obtain a smaller upper bound from (7) than from (5). By rewriting the result of Larcher in terms of $N_{\mathcal{B}_{\mathrm{ex}}}(\varepsilon,d)$ the dependence on $d$ can be illustrated very well; one obtains
[TABLE]
Here, for fixed $\varepsilon$ there is an exponential dependence on $d$, whereas in the estimate of (6) there is a linear dependence on $d$. Summarizing, according to $N_{\mathcal{B}_{\mathrm{ex}}}(\varepsilon,d)$ the result of Corollary 1 reduces the gap with respect to $d$; we obtain
[TABLE]
Footnote 2: After acceptance of the current paper a new upper bound of was proven in So17. From So17 one obtains for that with for . In particular, it shows that the lower bound cannot be improved with respect to the dimension. Note that the dependence on is not as good as in (6).
As already mentioned, for $\mathcal{B}_{\mathrm{per}}$ the estimates (2) and (3) are not applicable, since we do not know the VC-dimension. We construct a $\delta$-cover in Lemma 2 below and obtain the following estimate as a consequence of the theorem. Note that, since , we cannot expect something better than in Corollary 1.
Corollary 2
For and we have
$$\operatorname{disp}_{\mathcal{B}_{\mathrm{per}}}(n,d) \le \frac{4d}{n} \log(2n).$$
In particular,
[TABLE]
Indeed, the estimates of Corollary 2 are not as good as the estimates of Corollary 1. By adding the result of Ullrich (Ul15, Theorem 1) one obtains
[TABLE]
or stated differently,
[TABLE]
In particular, (10) illustrates the dependence on the dimension, namely, for fixed $\varepsilon$ Corollary 2 gives, except for a $\log d$ term, the right dependence on $d$.
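To get a feeling for the size of the two upper bounds, the following sketch evaluates $\frac{4d}{n}\log\left(\frac{9n}{d}\right)$ for the cube and $\frac{4d}{n}\log(2n)$ for the torus, and searches numerically for the smallest $n$ at which each expression drops below a given $\varepsilon$. The function names and the starting points of the search ($n \ge d$ for the cube, $n \ge 2$ for the torus, chosen here so that both expressions are decreasing in $n$) are assumptions of this illustration. The values returned are those implied by these two formulas, not the closed-form estimates for the inverse of the minimal dispersion stated in the corollaries.

```python
import math


def bound_cube(n, d):
    """Upper bound (4d/n) log(9n/d) for axis-parallel boxes in the cube."""
    return 4 * d / n * math.log(9 * n / d)


def bound_torus(n, d):
    """Upper bound (4d/n) log(2n) for periodic axis-parallel boxes."""
    return 4 * d / n * math.log(2 * n)


def smallest_n(bound, d, eps, n_start):
    """Smallest n >= n_start with bound(n, d) <= eps.

    Assumes bound(., d) is decreasing for n >= n_start, so that doubling
    followed by bisection is valid.
    """
    if bound(n_start, d) <= eps:
        return n_start
    lo, hi = n_start, 2 * n_start
    while bound(hi, d) > eps:
        lo, hi = hi, 2 * hi
    # Invariant: bound(lo, d) > eps >= bound(hi, d).
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if bound(mid, d) <= eps:
            hi = mid
        else:
            lo = mid
    return hi


if __name__ == "__main__":
    eps = 0.1
    for d in (2, 10, 50):
        print(d, smallest_n(bound_cube, d, eps, d), smallest_n(bound_torus, d, eps, 2))
```

For fixed $\varepsilon$ the number of points obtained from the cube bound grows proportionally to $d$, whereas the torus bound carries an additional logarithmic factor in $d$.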
In the rest of the paper we prove the stated results and provide a conclusion.
2 Auxiliary Results, Proofs and Remarks
For the proof of Theorem 1.1 we need the following lemma.
Lemma 1
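For $\delta > 0$ let $\Gamma_\delta$ be a $\delta$-cover of $\mathcal{B}$. Then, for any point set $P \subset [0,1]^d$ with $n$ elements we have
$$\operatorname{disp}(P,\mathcal{B}) \le \delta + \operatorname{disp}(P,\Gamma_\delta).$$
Proof
Let $B \in \mathcal{B}$ with $B \cap P = \emptyset$. Then, there are $L_B, U_B \in \Gamma_\delta$ with $L_B \subseteq B \subseteq U_B$ such that
$$\lambda_d(B \setminus L_B) \le \lambda_d(U_B \setminus L_B) \le \delta.$$
In particular, $L_B \cap P = \emptyset$ and
$$\lambda_d(B) = \lambda_d(L_B) + \lambda_d(B \setminus L_B) \le \operatorname{disp}(P,\Gamma_\delta) + \delta.$$
Remark 1
In the proof we actually only used that there is a set $L_B \subseteq B$ with $\lambda_d(B \setminus L_B) \le \delta$. Thus, instead of considering $\delta$-covers it would be enough to work with set systems which approximate $\mathcal{B}$ from below up to $\delta$.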
By probabilistic arguments similar to those of (BeCh87, Section 8.1) we prove the main theorem. As in (HeNoWaWo01, Theorem 1 and Theorem 3) for the star-discrepancy, it also turns out that such arguments are useful for studying the dependence on the dimension of the dispersion.
Proof of Theorem 1.1. By Lemma 1 it is enough to show that there is a point set $P$ which satisfies
$$\operatorname{disp}(P,\Gamma_\delta) \le \frac{\log|\Gamma_\delta|}{n}. \tag{11}$$
Let $(\Omega,\mathcal{F},\mathbb{P})$ be a probability space and $(X_i)_{1 \le i \le n}$ be an iid sequence of uniformly distributed random variables mapping from $\Omega$ into $[0,1]^d$. We consider the sequence of random variables $X = \{X_1,\dots,X_n\}$ as “point set” and prove that with high probability the desired property (11) is satisfied. For we have
[TABLE]
By the fact that and by choosing we obtain
[TABLE]
Thus, there exists a realization of $(X_i)_{1 \le i \le n}$, say $(x_i)_{1 \le i \le n} \subset [0,1]^d$, so that for $P = \{x_1,\dots,x_n\}$ the inequality (11) is satisfied. ∎
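The probabilistic argument can be observed numerically in a toy setting. The sketch below is an illustration under assumptions not taken from the paper: $d = 1$, the finite family $\Gamma$ consists of all grid intervals $[i/m, j/m)$ with $0 \le i < j \le m$, and all names are hypothetical. It draws $n$ iid uniform points and estimates how often the largest empty member of $\Gamma$ has length at most $\log|\Gamma|/n$.

```python
import math
import random


def disp_over_grid_intervals(points, m):
    """Largest length of a grid interval [i/m, j/m) containing no point."""
    occupied = [False] * m
    for p in points:
        occupied[min(int(p * m), m - 1)] = True
    longest = run = 0
    for occ in occupied:
        # A grid interval is empty exactly when all its cells are empty,
        # so the answer is the longest run of empty cells.
        run = 0 if occ else run + 1
        longest = max(longest, run)
    return longest / m


def success_fraction(m, n, trials, seed=0):
    """Fraction of random point sets X with disp(X, Gamma) <= log|Gamma| / n."""
    rng = random.Random(seed)
    threshold = math.log(m * (m + 1) // 2) / n
    hits = 0
    for _ in range(trials):
        points = [rng.random() for _ in range(n)]
        if disp_over_grid_intervals(points, m) <= threshold:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(success_fraction(m=10, n=50, trials=2000))
```

The theorem only needs this probability to be positive; in this toy setting the event is in fact typical, in line with the observation of Remark 2 below.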
Remark 2
By Lemma 1 and the same arguments as in the proof of the theorem one can see that a point set of iid uniformly distributed random variables satisfies a “good dispersion bound” with high probability. In detail,
[TABLE]
In particular, for confidence level and
[TABLE]
the probability that the random point set has dispersion smaller than is strictly larger than . This implies
[TABLE]
where the dependence on $d$ is hidden in $|\Gamma_\delta|$.
In the spirit of NoWo08; NoWo10; NoWo12 we are interested in polynomial tractability of the minimal dispersion, that is, $N_{\mathcal{B}}(\varepsilon,d)$ may not grow faster than polynomially in $\varepsilon^{-1}$ and $d$. The following corollary is a consequence of the theorem and provides a condition on the $\delta$-cover for such polynomial tractability.
Corollary 3
For $\delta \in (0,1)$ and the set of test sets $\mathcal{B}$ let $\Gamma_\delta$ be a $\delta$-cover satisfying
[TABLE]
Then, for one has
[TABLE]
Proof
Set in (4) and the assertion follows.
This implies the result of Corollary 1.
Proof of Corollary 1. By (Gn08, Formula (1), Theorem 1.15, Lemma 1.18) one has
[TABLE]
Here the last inequality follows mainly by and the assertion is proven by Corollary 3 with , , . ∎
For $\mathcal{B}_{\mathrm{per}}$ we need to construct a $\delta$-cover.
Lemma 2
For with and the set
[TABLE]
with
[TABLE]
is a -cover and satisfies .
Proof
For arbitrary with and there are
[TABLE]
such that
[TABLE]
Define and note that it is enough to find with and . For any coordinate we distinguish four cases illustrated in Figure 1:
1. Case: and :
Define and . (Here .)
2. Case: and :
Define and . (Here .)
3. Case: and :
Define and .
4. Case: and :
Define and .
In all cases we have as well as . For and the inclusion property with respect to does hold and
[TABLE]
By the choice of the right-hand side is bounded by and the assertion is proven.
Now we can easily prove an upper bound on the minimal dispersion with respect to $\mathcal{B}_{\mathrm{per}}$, as formulated in Corollary 2.
Proof of Corollary 2. By the previous lemma we know that there is a $\delta$-cover with cardinality bounded by . Then by Corollary 3 with , and the proof is finished. ∎
3 Conclusion
Based on $\delta$-covers we provide in the main theorem an estimate of the minimal dispersion similar to the one of (2). In the case where the VC-dimension of the set of test sets is not known, but a suitable $\delta$-cover can be constructed, our Theorem 1.1 leads to new results, as illustrated for $\mathcal{B}_{\mathrm{per}}$. One might argue that we only show existence of “good” point sets. However, Remark 2 tells us that a uniformly distributed random point set has small dispersion with high probability. As far as we know, an explicit construction of such point sets is not known.
Acknowledgements.
The author thanks Aicke Hinrichs, David Krieg, Erich Novak and Mario Ullrich for fruitful discussions on this topic.
- (1) Aistleitner, C., Hinrichs, A., Rudolf, D.: On the size of the largest empty box amidst a point set. Discrete Appl. Math. 230, 146–150 (2017)
- (2) Bachmayr, M., Dahmen, W., DeVore, R., Grasedyck, L.: Approximation of high-dimensional rank one tensors. Constr. Approx. 39(2), 385–395 (2014)
- (3) Beck, J., Chen, W.: Irregularities of Distribution (Cambridge Tracts in Mathematics). Cambridge University Press (1987)
- (4) Blumer, A., Ehrenfeucht, A., Haussler, D., Warmuth, M.: Learnability and the Vapnik-Chervonenkis dimension. J. Assoc. Comput. Mach. 36(4), 929–965 (1989)
- (5) Dick, J., Pillichshammer, F.: Digital Nets and Sequences: Discrepancy Theory and Quasi-Monte Carlo Integration. Cambridge University Press, Cambridge (2010)
- (6) Dick, J., Rudolf, D., Zhu, H.: Discrepancy bounds for uniformly ergodic Markov chain quasi-Monte Carlo. Ann. Appl. Probab. 26, 3178–3205 (2016)
- (7) Dumitrescu, A., Jiang, M.: On the largest empty axis-parallel box amidst n points. Algorithmica 66(2), 225–248 (2013)
- (8) Dumitrescu, A., Jiang, M.: Perfect vector sets, properly overlapping partitions, and largest empty box. Preprint, available at https://arxiv.org/abs/1608.06874 (2016)
