Dynamically Defined Sequences with Small Discrepancy
Stefan Steinerberger

TL;DR
This paper introduces a novel greedy method for constructing sequences on [0,1] with small discrepancy. It proves a bound of order $(\log N) N^{-1/2}$ and presents numerical evidence that the sequences may in fact be low-discrepancy sequences.
Contribution
The paper presents a new sequence construction method with a proven discrepancy bound of order $(\log N) N^{-1/2}$ (comparable to random points) and conjectures that the true order is $(\log N) N^{-1}$, which would match the classical low-discrepancy constructions.
Findings
Discrepancy bound $D_N \lesssim (\log N) N^{-1/2}$
Numerical evidence supports the conjecture $D_N \lesssim (\log N) N^{-1}$
Extension of discrepancy bounds to higher dimensions with similar conjectures.
Abstract
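We study the problem of constructing sequences $(x_n)_{n=1}^{\infty}$ on $[0,1]$ in such a way that $D_N$ is uniformly small. A result of Schmidt shows that necessarily $D_N \gtrsim (\log N) N^{-1}$ for infinitely many $N$, and there are several classical constructions attaining this growth. We describe a type of uniformly distributed sequence that seems to be completely novel: given $\{x_1, \dots, x_{N-1}\}$, we construct $x_N$ in a greedy manner $$x_N = \arg\min_{\min_k |x - x_k| \geq N^{-10}} \sum_{k=1}^{N-1} 1 - \log\left(2\sin(\pi |x - x_k|)\right).$$ We prove that $D_N \lesssim (\log N) N^{-1/2}$ and conjecture that $D_N \lesssim (\log N) N^{-1}$. Numerical examples illustrate this conjecture in a very impressive manner. We also establish a discrepancy bound $D_N \lesssim$ …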
| $N$ | Greedy sequence | Halton (bases 2, 3) | Hammersley (base 2) | Kronecker-type |
|---|---|---|---|---|
| 50 | 0.044 | 0.067 | 0.048 | 0.083 |
| 100 | 0.026 | 0.049 | 0.026 | 0.037 |
| 150 | 0.018 | 0.039 | 0.017 | 0.070 |
| 200 | 0.013 | 0.022 | 0.014 | 0.026 |
| 250 | 0.012 | 0.018 | 0.012 | 0.026 |
| $N$ | 10 | 25 | 50 | 100 | 150 | 200 |
|---|---|---|---|---|---|---|
| Star discrepancy | 0.32 | 0.12 | 0.06 | 0.032 | 0.022 | 0.016 |
Dynamically Defined Sequences
with small discrepancy
Stefan Steinerberger
Department of Mathematics, Yale University, New Haven, CT 06511, USA
Abstract.
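We study the problem of constructing sequences $(x_n)_{n=1}^{\infty}$ on $[0,1]$ in such a way that
$$D_N = \sup_{0 \leq x \leq 1} \left| \frac{\#\{1 \leq i \leq N : x_i \leq x\}}{N} - x \right|$$
is small. A result of Schmidt shows that for all sequences $(x_n)_{n=1}^{\infty}$ on $[0,1]$ we have $D_N \gtrsim (\log N) N^{-1}$ for infinitely many $N$; several classical constructions attain this growth. We describe a type of uniformly distributed sequence that seems to be completely novel: given $\{x_1, \dots, x_{N-1}\}$, we construct $x_N$ in a greedy manner
$$x_N = \arg\min_{\min_k |x - x_k| \geq N^{-10}} \sum_{k=1}^{N-1} 1 - \log\left(2\sin(\pi |x - x_k|)\right).$$
We prove that $D_N \lesssim (\log N) N^{-1/2}$ and conjecture that $D_N \lesssim (\log N) N^{-1}$. Numerical examples illustrate this conjecture in a very impressive manner. We also establish a discrepancy bound $D_N \lesssim (\log N)^{d/2} N^{-1/2}$ for an analogous construction in higher dimensions and conjecture it to be $D_N \lesssim (\log N)^{d} N^{-1}$.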
Key words and phrases:
Low discrepancy sequence, energy functional.
2010 Mathematics Subject Classification:
11L03, 42B05, 82C22.
The author is supported by the NSF (DMS-1763179) and the Alfred P. Sloan Foundation.
1. Introduction
1.1. Introduction.
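Let $(x_n)_{n=1}^{\infty}$ be a sequence on $[0,1]$ and define the star discrepancy of the first $N$ elements via
$$D_N = \sup_{0 \leq x \leq 1} \left| \frac{\#\{1 \leq i \leq N : x_i \leq x\}}{N} - x \right|.$$
van der Corput asked in 1935 whether there is a sequence for which $D_N \lesssim N^{-1}$. This was disproven by van Aardenne-Ehrenfest [1]. Roth [19] showed that for any sequence in $[0,1]$, we have $D_N \gtrsim (\log N)^{1/2} N^{-1}$ for infinitely many $N$. The sharp result is due to Schmidt [20], who showed that for any sequence in $[0,1]$ there are infinitely many $N$ for which
$$D_N \gtrsim \frac{\log N}{N}.$$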
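The star discrepancy of a finite point set in $[0,1]$ can be evaluated exactly from the sorted points. The following is a minimal Python sketch, assuming the usual definition as the supremum over $0 \leq x \leq 1$ of $|\#\{i \leq N : x_i \leq x\}/N - x|$; the function name `star_discrepancy_1d` is hypothetical and does not come from the paper.

```python
from typing import Sequence


def star_discrepancy_1d(points: Sequence[float]) -> float:
    """Exact star discrepancy of a finite point set in [0, 1].

    Assumes the standard definition sup_x |#{x_i <= x}/N - x|.
    The supremum is attained at, or just below, one of the points,
    so it suffices to inspect the sorted points.
    """
    xs = sorted(points)
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one point")
    worst = 0.0
    for i, x in enumerate(xs, start=1):
        # At x the empirical count is i; just below x it is i - 1.
        worst = max(worst, i / n - x, x - (i - 1) / n)
    return worst
```

For instance, the centered equispaced points $(2i-1)/(2N)$ have star discrepancy $1/(2N)$, the smallest possible value for $N$ points.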
Other proofs of Schmidt's result were given by Bejian [4], Halasz [12] and Liardet [15]; the best constant is due to Larcher & Puchhammer [14]. Several sequences attaining this growth have been constructed; we refer to the classical textbooks by Beck & Chen [3], Dick & Pillichshammer [10], Drmota & Tichy [11] and Kuipers & Niederreiter [13]. As soon as one generalizes the problem to sequences in higher dimensions using the notation and
[TABLE]
the problem of finding sharp lower bounds on the discrepancy is open. Roth [19] proved that any sequence in $[0,1]^d$ has
$$D_N \gtrsim \frac{(\log N)^{d/2}}{N} \quad \text{for infinitely many } N.$$
An improvement by a double logarithmic factor for $d = 2$ is due to Beck [2]. The best known result is due to Bilyk, Lacey, Vagharshakyan [6, 7] and states that for any sequence in $[0,1]^d$
$$D_N \gtrsim \frac{(\log N)^{d/2 + \eta_d}}{N} \quad \text{for infinitely many } N$$
and some $\eta_d > 0$ depending only on $d$. There is no consensus on what the sharp result should be: the two main conjectures (we refer to [5]) are that for any sequence there are infinitely many $N$ such that
$$D_N \gtrsim \frac{(\log N)^{(d+1)/2}}{N} \qquad \text{or} \qquad D_N \gtrsim \frac{(\log N)^{d}}{N}.$$
Of course, both conjectures coincide for $d = 1$. The first conjecture has the advantage of being structurally aligned with related conjectures in Harmonic Analysis and Probability Theory, while the second conjecture has the advantage of being matched by the best known constructions. If the first conjecture were true, this would imply that in dimensions $d \geq 2$ there are sequences more regular than anything we can currently construct. Many of the classical sequences attaining $D_N \lesssim (\log N)^{d} N^{-1}$ exploit regular structures derived from Number Theory (irrational rotations on the torus, regularity in digit expansions), so one could try to understand whether it is possible to construct sequences with small discrepancy using a different viewpoint.
1.2. Results.
This paper is a companion paper to [22] where we showed that minimizing a certain functional can decrease the discrepancy of point sets. Here we show that this functional also allows us to construct uniformly distributed sequences in a way that is very different from the usual constructions. Suppose we are given $\{x_1, \dots, x_{N-1}\} \subset [0,1]$; we construct $x_N$ in a greedy manner
$$x_N = \arg\min_{\min_k |x - x_k| \geq N^{-10}} \sum_{k=1}^{N-1} 1 - \log\left(2\sin(\pi |x - x_k|)\right).$$
If the minimizer is not unique, any choice is admissible. The gap condition ensures that the new point is not extremely close to any of the existing points. We could replace it by for any without it affecting the main result (except for constants). One can start with any given set and then obtain a sequence in this greedy manner.
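The greedy rule can be explored numerically. The following is a minimal sketch, assuming the energy $\sum_k 1 - \log(2\sin(\pi|x - x_k|))$ and replacing the exact minimization by a search over a uniform grid; the names `energy`, `greedy_sequence` and `grid_size` are hypothetical, and the grid search is an illustrative simplification rather than the procedure used for the paper's numerics. The gap condition is not enforced explicitly: grid points that coincide with an existing point receive infinite energy and are never selected.

```python
import numpy as np


def energy(x: np.ndarray, points: np.ndarray) -> np.ndarray:
    """Evaluate sum_k 1 - log(2 sin(pi |x - x_k|)) at every entry of x."""
    diff = np.abs(x[:, None] - points[None, :])
    # log(0) = -inf where a grid point coincides with an existing point;
    # the resulting +inf energy simply excludes that grid point.
    with np.errstate(divide="ignore"):
        return np.sum(1.0 - np.log(2.0 * np.sin(np.pi * diff)), axis=1)


def greedy_sequence(initial, n_total, grid_size=20000):
    """Extend `initial` to n_total points by greedy energy minimization.

    Assumption: the minimizer is searched on a uniform grid of
    grid_size points in [0, 1); ties are broken by np.argmin.
    """
    pts = [float(p) for p in initial]
    grid = np.linspace(0.0, 1.0, grid_size, endpoint=False)
    while len(pts) < n_total:
        values = energy(grid, np.asarray(pts))
        pts.append(float(grid[np.argmin(values)]))
    return pts
```

Starting from the single point 0, this sketch places the next point at 1/2 and the one after at 1/4 or 3/4, depending on how the tie is broken.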
Theorem 1.
We have, for any sequence thus constructed,
$$D_N \lesssim \frac{\log N}{N^{1/2}},$$
where the implicit constant depends only on the initial set.
This bound in itself is not impressive (random points behave in a similar manner) but it is interesting that the outcome of such a greedy algorithm can be controlled at all. However, we believe that a much stronger statement is true: we conjecture that one can ignore the condition without fundamentally altering the sequence and that one will (independently of whether one ignores or not) obtain a low-discrepancy sequence.
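Conjecture 1. For any initial set, the greedy sequence arising out of
$$x_N = \arg\min_{x} \sum_{k=1}^{N-1} 1 - \log\left(2\sin(\pi |x - x_k|)\right)$$
satisfies $D_N \lesssim (\log N) N^{-1}$. A stronger conjecture would be that the implicit constant in $D_N \lesssim (\log N) N^{-1}$ does not depend on the initial set as $N \to \infty$.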
If this statement were true, it would give rise to a large number of low-discrepancy sequences that are constructed by a technique very different from any of the usual ones. One byproduct of our argument is as follows.
Theorem 2.
Suppose we define a sequence in a greedy manner by picking in such a way that
[TABLE]
then
[TABLE]
We note that it is always possible to choose such a since
[TABLE]
Theorem 2 is not very deep and might be close to optimal; presumably there are various different choices of that are admissible and some of them might not be particularly good for the purpose of constructing low-discrepancy sequences (though, as Theorem 2 states, they cannot be arbitrarily bad either). The emphasis of our paper (as well as the numerical experimentation, see §1.3) is that choosing the minimum may lead to very good behavior. In particular, an alternative sequence that may be interesting for further study could be
[TABLE]
The even more general case would be
[TABLE]
for one-periodic functions . There are two obvious questions: (1) are there certain functions that are particularly suited for producing regular sequences in this manner (even purely numerical results would be of interest) and (2) what can be proven about them? We emphasize that Pausinger's theorem (see §1.3) suggests that there might be large families of functions resulting in sequences with very good distribution properties.
1.3. Basic Numerics.
One of the main reasons why Conjecture 1 seems reasonable is that in numerical examples, the sequence performs extraordinarily well. Given the first elements of a sequence , we will associate to it the set
[TABLE]
and use the star-discrepancy as a sign of quality (see Fig. 1).
Our sequence is actually comparable (or even superior) in quality to many of the classical constructions (see also [22]). We compare (see Table 1) the sequence with the Halton set (using base 2 and 3), the Hammersley sequence (using base 2) and the Kronecker-type set
[TABLE]
The choice of is more or less at random but was selected to give somewhat nice behavior (except for , see Table 1). We observe that our sequence, starting with (which was also more or less chosen at random), is comparable or superior to the other examples.
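For readers who want to reproduce the comparison sets, the following is a minimal Python sketch of the Halton points in bases 2 and 3 and the Hammersley points in base 2. The indexing conventions (Halton starting at index 1, Hammersley at index 0) are assumptions, since conventions differ between references, and the function names are hypothetical.

```python
def radical_inverse(n: int, base: int) -> float:
    """Reflect the base-`base` digits of n about the radix point."""
    result, scale = 0.0, 1.0 / base
    while n > 0:
        n, digit = divmod(n, base)
        result += digit * scale
        scale /= base
    return result


def halton_2d(count: int) -> list:
    """First `count` Halton points in bases 2 and 3 (indices 1..count)."""
    return [(radical_inverse(n, 2), radical_inverse(n, 3))
            for n in range(1, count + 1)]


def hammersley_2d(count: int) -> list:
    """Hammersley set (n / count, radical inverse of n in base 2), n = 0..count-1."""
    return [(n / count, radical_inverse(n, 2)) for n in range(count)]
```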
This behavior seems quite robust under various initial conditions. We could try to intentionally "break" the sequence by starting with a particularly bad initial configuration. We observe that the sequence auto-adjusts in a nice way. We illustrate this below for the sequence starting with the initial set of points (see Fig. 2). In both cases we see that the newly added points initially avoid the clustered regions and then slowly return to it (though, initially, at a lower density, see Fig. 2). We refer to Table 2 for the behavior of their star discrepancy which is initially quite large (forced by the clustered initial points) and then stabilizes very quickly.
We observed numerically that certain initial conditions may be connected to variants of the van der Corput sequence in base 2. If we start with the set , then one admissible way of choosing minima leads to the sequence
[TABLE]
Can this be proven? Can admissible permutations of the van der Corput sequence, that can arise in this manner, be characterized?
Note added in print: this property has since been proven by Florian Pausinger [18] who established the following stronger result: if is symmetric, , and uniformly convex, then
[TABLE]
started with results in either the van der Corput sequence in base 2 or a permuted van der Corput sequence in base 2 (and these permutations can be precisely understood). Moreover, all these sequences have the property that has the same value for all of these sequences (i.e. depending only on , not on the particular permutation). In particular, as we conjecture to be true in general, the condition is automatically enforced by the minimization problem.
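For reference, the van der Corput sequence in base 2 is obtained by reversing the binary digits of the index. The sketch below assumes the convention that the sequence starts at index 0 (so the first terms are 0, 1/2, 1/4, 3/4); the function name is hypothetical. It also illustrates why permutations within dyadic blocks are harmless at dyadic lengths: the first $2^m$ terms always form the set $\{j/2^m : 0 \leq j < 2^m\}$.

```python
def van_der_corput_base2(count: int) -> list:
    """First `count` terms of the base-2 van der Corput sequence (index 0 first)."""
    terms = []
    for n in range(count):
        value, denom = 0, 1
        while n:
            # Move the lowest binary digit of n behind the radix point.
            value = 2 * value + (n & 1)
            denom *= 2
            n >>= 1
        terms.append(value / denom)
    return terms
```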
1.4. Higher dimensions.
The construction rule for sequences in is slightly different: suppose we have constructed , then we want the next element to satisfy
[TABLE]
Integrating over shows that such a always exists.
Theorem 3.
Any such sequence satisfies
[TABLE]
where the implicit constant depends only on the initial set.
As in the one-dimensional case, we have the following
Conjecture 2. The greedy algorithm
[TABLE]
leads to a sequence with .
1.5. Outlook.
We have introduced the sequence
[TABLE]
shown that it is uniformly distributed and given some indication that it might be a low-discrepancy sequence. However, as also evidenced by the results of Pausinger [18], there is no reason to assume that there is anything particularly special about and similar results might be true at a much greater level of generality for large families of functions . Our particular function is required to show since it has a Fourier-analytic connection to the Erdős-Turán inequality; it also has a natural connection to the fractional Laplacian (we refer to the companion paper [22] for details). Nonetheless, other functions may give rise to equally good constructions and, especially in higher dimensions, it is not at all clear what function could lead to the best results.
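The Fourier-analytic connection presumably rests on the classical expansion $-\log(2\sin(\pi x)) = \sum_{k \geq 1} \cos(2\pi k x)/k$ for $0 < x < 1$, whose coefficients $1/k$ are the weights appearing in the Erdős-Turán inequality. The following sketch, with hypothetical function names, checks this identity numerically; that this is the exact form used in the paper is an assumption.

```python
import math


def log_sine_partial_sum(x: float, terms: int) -> float:
    """Partial sum of the series sum_{k>=1} cos(2 pi k x) / k."""
    return sum(math.cos(2.0 * math.pi * k * x) / k for k in range(1, terms + 1))


def log_sine_closed_form(x: float) -> float:
    """Closed form -log(2 sin(pi x)), valid for 0 < x < 1."""
    return -math.log(2.0 * math.sin(math.pi * x))
```

The series converges only conditionally, so the partial sums approach the closed form slowly, with an error of order $1/(n \sin(\pi x))$ after $n$ terms.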
2. Proofs
2.1. A Lemma.
We start by proving a regularity statement for minimizers of the sum of logarithms. When we apply it to prove the main results, one of the relevant quantities is inside a logarithm. As a consequence, it is not tremendously important whether we prove Lemma 1 and Lemma 2 with bounds at scale or and thus we have not tried to optimize the arguments. However, a much stronger version of Lemma 1, in particular showing that the minimum is for many terms along the sequence actually at scale , could possibly improve the main result.
Lemma 1.
Let . Then there exists such that
[TABLE]
Proof of Lemma 1.
We introduce a one-parameter family of functions for via
[TABLE]
and note that
[TABLE]
is the solution of the heat equation starting with , in particular the maximum principle for parabolic equations is telling us that for any
[TABLE]
Moreover, by construction, for every
[TABLE]
We now establish a series of bounds on . We will work at scale but this is not important at this point. We first observe that
[TABLE]
We now use a basic Lemma of Montgomery [16] (we refer to the nice expositions in [17, §5.12] and Chazelle [9, Lemma 3.8.] as well as [8] for a recent refinement) ensuring that
[TABLE]
and obtain
[TABLE]
We next observe that
[TABLE]
and thus
[TABLE]
Altogether, abbreviating
[TABLE]
we have shown that
[TABLE]
We also note that arises from the forward evolution of the heat equation and is thus smooth. We will now use this to show that
[TABLE]
which we see as follows: clearly, from we observe that
[TABLE]
If the maximum is attained at a negative value of , we are done. If it is attained at a positive value, then the bound on the derivative implies
[TABLE]
which then, with the mean 0 condition, implies
[TABLE]
This implies the existence of a point such that
[TABLE]
This shows that the heat equation applied to yields a small value. We now argue that this means that the original function has to have a small value that is not particularly close to any of the points. We can write (identifying the unit interval with the Torus )
[TABLE]
where
[TABLE]
is the Jacobi function. The Jacobi function satisfies
[TABLE]
Using the easy estimate
[TABLE]
and defining
[TABLE]
we can estimate
[TABLE]
which implies as desired. □
There is a technical step that could be slightly improved. We observe that
[TABLE]
The exponential cutoff localizes the sum essentially at frequency scales which shows that we can expect the derivative to be (possibly up to a logarithmic factor) at scale as opposed to . However, this improvement would have no further impact on our main result.
2.2. An Error Bound
The second technical ingredient is straightforward.
Lemma 2.
For all and all , we have
[TABLE]
Proof.
We use summation by parts. Summation by parts states that if are two sequences, then
[TABLE]
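Summation by parts comes in several equivalent forms. The sketch below checks one standard version, $\sum_{k=1}^{n} a_k b_k = A_n b_n - \sum_{k=1}^{n-1} A_k (b_{k+1} - b_k)$ with $A_k = a_1 + \dots + a_k$; that this matches the exact form used in the proof is an assumption, and the function names are hypothetical.

```python
from itertools import accumulate


def abel_lhs(a, b):
    """Left-hand side: sum_k a_k * b_k."""
    return sum(x * y for x, y in zip(a, b))


def abel_rhs(a, b):
    """Right-hand side: A_n b_n - sum_{k<n} A_k (b_{k+1} - b_k)."""
    partial = list(accumulate(a))  # partial[k] = a_0 + ... + a_k (0-based)
    n = len(a)
    boundary = partial[-1] * b[-1]
    correction = sum(partial[k] * (b[k + 1] - b[k]) for k in range(n - 1))
    return boundary - correction
```

With $a_k = \cos(2\pi k x)$ and $b_k = 1/k$, the partial sums $A_k$ stay bounded by $1/|\sin(\pi x)|$ while the differences $b_{k+1} - b_k$ are summable, which is the usual way such trigonometric sums are controlled.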
We set
[TABLE]
Then
[TABLE]
It remains to estimate the supremum. We have, using ,
[TABLE]
□
The main consequence of Lemma 1 and Lemma 2 can now be written as follows.
Lemma 3.
Let be arbitrary and let
[TABLE]
Then, for any ,
[TABLE]
Proof.
This follows from Lemma 1, Lemma 2 and the decomposition
[TABLE]
□
2.3. Proof of Theorem 1 and Theorem 2
Proof.
Our derivation is motivated by the Erdős-Turán inequality bounding the discrepancy of a set by
[TABLE]
where is arbitrary. We can bound this from above by Cauchy-Schwarz
[TABLE]
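The exponential sums entering the Erdős-Turán inequality are easy to evaluate numerically. The sketch below computes the quantity $1/n + \sum_{k=1}^{n} \frac{1}{k} \left| \frac{1}{N} \sum_{\ell=1}^{N} e^{2\pi i k x_\ell} \right|$, which bounds the discrepancy up to an absolute constant in the textbook form of the inequality (see, e.g., [13]). The constants are deliberately omitted and the function name is hypothetical, so the output should be read as a proxy rather than a rigorous bound.

```python
import cmath
import math


def erdos_turan_sum(points, n: int) -> float:
    """Return 1/n + sum_{k=1}^n |(1/N) sum_l exp(2 pi i k x_l)| / k.

    Absolute constants of the Erdos-Turan inequality are omitted.
    """
    count = len(points)
    total = 1.0 / n
    for k in range(1, n + 1):
        exp_sum = sum(cmath.exp(2j * math.pi * k * x) for x in points)
        total += abs(exp_sum) / (count * k)
    return total
```

For the equispaced points $j/N$, all exponential sums with $1 \leq k < N$ vanish, so only the $1/n$ term survives.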
We square the second term and decouple it into diagonal and off-diagonal terms
[TABLE]
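The decoupling into diagonal and off-diagonal terms rests on the following elementary identity for a single frequency $k$, stated here as a reminder; how the weights in $k$ are distributed in the paper's display is not reproduced and is left as in the original.

```latex
\left| \sum_{\ell=1}^{N} e^{2\pi i k x_\ell} \right|^2
  = \sum_{\ell, m = 1}^{N} e^{2\pi i k (x_\ell - x_m)}
  = N + 2 \sum_{1 \leq m < \ell \leq N} \cos\left( 2\pi k (x_\ell - x_m) \right).
```

The $N$ diagonal terms ($\ell = m$) each contribute 1, and each off-diagonal pair contributes a complex number together with its conjugate, which produces the cosine.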
Summing the points in pairs, we can simplify the double sum over and in the above expression to
[TABLE]
Altogether,
[TABLE]
Altogether, this shows that
[TABLE]
We now argue that the sum is negative because every sum (with respect to ) is negative. Indeed, Lemma 3 implies that the choice, for every ,
[TABLE]
shows that
[TABLE]
This establishes the desired result. Theorem 2 follows from the same line of reasoning if we start from with instead of and keep all the trigonometric terms (as opposed to using error bounds to move to the logarithm). □
We note that using this particular way of taking a limit to obtain a Fourier series was already hinted at in earlier work of the author [21].
2.4. Proof of Theorem 3.
Proof.
We use the Erdős-Turán-Koksma inequality to bound the discrepancy of a set by
[TABLE]
where is given by
[TABLE]
The Cauchy-Schwarz inequality implies
[TABLE]
We rewrite the sum as
[TABLE]
However, this sum can also be written as (after additionally summing over and then subtracting the arising value )
[TABLE]
We now separate the diagonal terms and see that
[TABLE]
As in the proof of Theorem 1, we can reorder the sum and then use the fact that all the latter sums are negative. This shows that
[TABLE]
□
Acknowledgment. The author is grateful to two anonymous referees whose many suggestions greatly improved the quality of the manuscript.
References
- [1] T. van Aardenne-Ehrenfest, Proof of the Impossibility of a Just Distribution of an Infinite Sequence Over an Interval, Proc. Kon. Ned. Akad. Wetensch. 48 (1945), 3–8.
- [2] J. Beck, A two-dimensional van Aardenne-Ehrenfest theorem in irregularities of distribution, Compositio Math. 72 (1989), no. 3, 269–339.
- [3] J. Beck and W. Chen, Irregularities of Distribution, Cambridge Tracts in Mathematics (No. 89), Cambridge University Press, 1987.
- [4] R. Bejian, Minoration de la discrepance d'une suite quelconque sur T, Acta Arith. 41 (1982), no. 2, 185–202.
- [5] D. Bilyk, Roth's Orthogonal Function Method in Discrepancy Theory and Some New Connections, in the book "Panorama of Discrepancy Theory", Lecture Notes in Math 2107, Springer Verlag, 2014, pp. 71–158.
- [6] D. Bilyk and M. Lacey, On the small ball Inequality in three dimensions, Duke Math. J. 143 (2008), no. 1, 81–115.
- [7] D. Bilyk, M. Lacey and A. Vagharshakyan, On the small ball inequality in all dimensions, J. Funct. Anal. 254 (2008), no. 9, 2470–2502.
- [8] D. Bilyk, F. Dai and S. Steinerberger, General and Refined Montgomery Lemmata, Math. Ann., to appear.
