
TL;DR
This paper proves that, outside an exceptional set of parameters of zero Hausdorff dimension, all self-similar measures on the line have power decay of the Fourier transform at infinity, extending the classical results of Erdős and Kahane from the homogeneous case to the non-homogeneous case.
Contribution
It establishes Fourier decay for a broad class of self-similar measures, including non-homogeneous cases, outside a zero Hausdorff dimension set of parameters.
Findings
Most self-similar measures on the line have power decay of the Fourier transform at infinity.
Fourier decay holds outside a zero Hausdorff dimension exceptional set.
Extends classical results from homogeneous to non-homogeneous measures.
Fourier decay for self-similar measures
BORIS SOLOMYAK
Boris Solomyak, Department of Mathematics, Bar-Ilan University, Ramat Gan, Israel
Abstract.
We prove that, after removing a zero Hausdorff dimension exceptional set of parameters, all self-similar measures on the line have a power decay of the Fourier transform at infinity. In the homogeneous case, when all contraction ratios are equal, this is essentially due to Erdős and Kahane. In the non-homogeneous case the difficulty we have to overcome is the apparent lack of convolution structure.
Supported in part by the Israel Science Foundation grant 396/15.
1. Introduction
For a finite positive Borel measure on the real line, consider the Fourier transform
[TABLE]
The behavior of the Fourier transform at infinity is an important issue in many areas of mathematics. The measure is called a Rajchman measure if its Fourier transform tends to zero at infinity. The Riemann-Lebesgue Lemma says that absolutely continuous measures are Rajchman, but which singular measures are Rajchman is a subtle question with a long history, see [23]. For many purposes simple convergence of the Fourier transform to zero is not enough, and some quantitative decay is needed.
Definition 1.1.
For let
[TABLE]
and denote . A measure is said to have power Fourier decay if .
This property has a number of applications: for instance, if has power Fourier decay, then -almost every number is normal to any base, see [7, 29], and the support of has positive Fourier dimension, see [24].
In this paper we focus on the most basic class of “fractal measures,” namely, self-similar measures on the line.
Definition 1.2.
Let , , , and let be a probability vector. The Borel probability measure on satisfying
[TABLE]
is called self-similar, or invariant, for the iterated function system (IFS) , with the probability vector . It is well known that there exists a unique such measure [12]. We assume that the fixed points are not all equal (otherwise, the measure is a point mass) and call the corresponding pairs non-trivial. We write if for all .
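The definition can be made concrete with a small random-iteration sampler. This is a minimal sketch, assuming the IFS consists of affine maps of the form f_j(x) = lams[j] * x + bs[j] with all ratios in (0, 1); the function names and the example parameters are hypothetical and are not taken from the paper.

```python
import random


def sample_self_similar(lams, bs, probs, n_samples=1000, depth=60, seed=0):
    """Draw approximate samples from the self-similar measure of the IFS
    f_j(x) = lams[j] * x + bs[j] with weights probs[j] (assumed form)."""
    rng = random.Random(seed)
    indices = range(len(lams))
    samples = []
    for _ in range(n_samples):
        x = 0.0
        # Compose `depth` randomly chosen maps. Since every map is a
        # contraction, the influence of the starting point decays like
        # max(lams) ** depth.
        for _ in range(depth):
            j = rng.choices(indices, weights=probs)[0]
            x = lams[j] * x + bs[j]
        samples.append(x)
    return samples


def fixed_points(lams, bs):
    """Fixed point of each map: x = lam * x + b  <=>  x = b / (1 - lam)."""
    return [b / (1.0 - lam) for lam, b in zip(lams, bs)]
```

The convex hull of the fixed points is mapped into itself by every map, so samples started inside it stay inside it. If all fixed points coincide, every sample equals that point, which is the trivial case excluded in the definition.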
Theorem 1.3.
For , there exists a set of zero Hausdorff dimension in such that for all , for all non-trivial and for all we have .
The theorem is an immediate consequence of the following:
Theorem 1.4.
Fix , and . Then there exist and , depending on these parameters, such that and for all , with
[TABLE]
non-trivial, and satisfying , we have .
We do not attempt to give specific quantitative estimates of the decay rate, although in principle, this is possible. Our proof gives extremely slow power decay.
Theorem 1.3 should be compared with a recent result of Li and Sahlsten, which was an inspiration for us.
Theorem 1.5 ([22]).
Let be a self-similar measure with non-trivial and .
(i) If there exist such that is irrational, then as .
(ii) If is Diophantine for some , then there exists such that
[TABLE]
The methods are quite different: [22] uses an approach based on renewal theory, whereas we develop a multi-parameter generalization of the so-called “Erdős-Kahane argument”.
1.1. Background
The best-known case is homogeneous, when all the contraction ratios are equal: . There is a vast literature devoted to it, so we will be brief. An important class of examples is the family of Bernoulli convolutions , which is defined as the invariant measure for the IFS , with and probabilities . One of the original motivations for studying was the problem: for which is singular/absolutely continuous? (it follows from the “Law of Pure Type” that cannot be of mixed type [13]). Erdős [8] proved that as when is a Pisot number, hence the corresponding is singular. Recall that a Pisot number is an algebraic integer greater than one whose algebraic (Galois) conjugates are all less than one in modulus. Later Salem [32] showed that if is not a Pisot number, then as , thus providing a characterization of Rajchman Bernoulli convolution measures. In spite of the recent breakthrough results, see [11, 33, 34, 39, 38], the original problem of absolute continuity/singularity for is still open.
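Erdős' observation for Pisot parameters can be checked numerically. The sketch below assumes the Bernoulli convolution is the law of the random series with terms ±lam**n (n = 0, 1, 2, ...) and equal probabilities, so that its Fourier transform is the infinite product of cos(2*pi*lam**n*xi); the normalization and the function name are assumptions, not the paper's notation. Because the cosine is even, the sign convention in the exponential does not matter here.

```python
import math


def bernoulli_convolution_hat(lam, xi, terms=200):
    """Truncated product prod_{n=0}^{terms-1} cos(2*pi*lam**n*xi), an
    approximation of the Fourier transform of the law of sum(+-lam**n)."""
    value = 1.0
    for n in range(terms):
        value *= math.cos(2.0 * math.pi * (lam ** n) * xi)
    return value


# The golden ratio is a Pisot number: its powers approach integers
# exponentially fast, so most factors of the product are close to 1.
golden = (1.0 + math.sqrt(5.0)) / 2.0
lam = 1.0 / golden

# Sample the transform along the frequencies xi = golden**k.
values = [abs(bernoulli_convolution_hat(lam, golden ** k)) for k in range(10, 21)]
```

Along this sequence of frequencies the absolute values stabilize at a small but nonzero level instead of tending to zero, which is the mechanism behind the failure of the Rajchman property in the Pisot case.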
The first non-trivial result on absolute continuity of was obtained by Erdős [9] in 1940. In fact, he proved that for any there exists such that for a.e. . Using this and the convolution structure of , he deduced that is absolutely continuous for a.e. sufficiently close to 1. Later, Kahane [16] realized that Erdős’ argument actually gives that for all outside a set of zero Hausdorff dimension. (We should mention that only very few specific are known, for which has power Fourier decay, found by Dai, Feng, and Wang [6].) The Erdős-Kahane result plays an important role in the proof of absolute continuity for all outside of a zero Hausdorff dimension set by Shmerkin [33, 34]. The general homogeneous case is treated analogously to Bernoulli convolutions: the self-similar measure is still an infinite convolution and most of the arguments go through with minor modifications, see [6, 36]. An exposition of the “Erdős-Kahane argument” with quantitative estimates was given in [28], and then extended and generalized. Its variants were used in a number of recent papers in fractal geometry and dynamical systems, among them [36, 35, 10, 30, 15, 3, 4].
In the non-homogeneous case, not all contraction ratios are the same and the self-similar measure is not a convolution, which makes its study more difficult. The first results on absolute continuity were obtained by Neunhäuserer [26] and Ngai and Wang [27]. In [30], joint with Saglietti and Shmerkin, we proved that, given a probability vector and vector of translations , with all components distinct, for a.e. in the “natural” parameter region (which depends on ), the self-similar measure is absolutely continuous. The proof was based on a decomposition of the self-similar measure into an integral of measures having a convolution structure, that are only statistically self-similar. A variant of the Erdős-Kahane argument was used to establish power Fourier decay for the latter (for all but a zero-dimensional set of parameters), but this was not sufficient to deduce any Fourier decay for the original self-similar measure. The methods of [30] were pushed further by Käenmäki and Orponen [15], but again, no conclusion was made for the Fourier decay of non-homogeneous self-similar measures.
We note (thanks to Pablo Shmerkin for bringing this to my attention) that a measure may have power decay outside of a sparse set of frequencies, even if it is not a Rajchman measure. In fact, Kaufman [17] (in the homogeneous case) and Tsujii [37] (in the non-homogeneous case) proved that for any non-trivial self-similar measure on the real line, for any there exists such that the set
[TABLE]
can be covered by intervals of length . Mosquera and Shmerkin [25] made the dependence of on quantitative in the homogeneous case. The papers [17, 25] use a version of the Erdős-Kahane argument, whereas the proof in [37] is based on large deviation estimates.
The study of Fourier decay for other classes of dynamically defined measures has been quite active recently. We only mention a few papers, without an attempt to be comprehensive. Jordan and Sahlsten [14] obtained power Fourier decay for Gibbs measures for the Gauss map, using methods from dynamics and number theory. Bourgain and Dyatlov [2] established Fourier decay for Patterson-Sullivan measures associated to a convex co-compact Fuchsian group, using methods from additive combinatorics; see also [31, 20]. Li [19] proved that the stationary measure for a random walk on has power decay, when the support of the driving measure generates a Zariski dense subgroup, following his earlier work [18] showing that such a measure is Rajchman. He initiated the approach based on renewal theory, which was later used by Li and Sahlsten [22] to prove Theorem 1.5. Recently the same authors extended their result to a class of self-affine measures in in [21].
The rest of the paper is devoted to the proof of Theorem 1.4. As already mentioned, it is based on a generalization of the Erdős-Kahane argument, but there are many new features, mainly because we have to deal with multi-parameter families.
2. Reduction
In view of being non-trivial, by a linear change of variable we can fix two translation parameters, for instance, and arbitrary (it can even depend on the other parameters; this would only change the scale on the -axis, but would not affect the rate of decay of the Fourier transform). After that, we will pass to a higher iterate of the IFS, which preserves the invariant measure. The reason for doing this is to obtain an IFS with many maps having the same contraction ratio (in fact, the number of maps , grows exponentially with , whereas the number of distinct contraction ratios grows polynomially). In this sense, the proof resembles the strategy of the proof in [30], although in other aspects it is very different. We now formulate the main technical result.
Theorem 2.1.
Let , and consider the IFS
[TABLE]
where is the set of distinct contraction ratios, (so the number of maps in the IFS is strictly greater than ), is a vector of translations, and is a probability vector. Let be the corresponding self-similar measure. Fix and . Assume that
[TABLE]
Let be such that
[TABLE]
Then there exist and , depending on , such that and for all , satisfying (2.2), for all such that
[TABLE]
and all such that , we have .
Derivation of Theorem 1.4 from Theorem 2.1.
As already mentioned, we may assume that the original IFS has , with arbitrary (we do not exclude the case ). Passing to the -th iterate, we obtain an IFS with the number of maps equal to and the number of distinct contractions less than or equal to
[TABLE]
which is the number of ways to write as a sum of non-negative integers. Among the maps of the new IFS there are
[TABLE]
This way, we can let , , so that , and choose to satisfy (2.4). Denote the new IFS by . Since the invariant measure remains unchanged when we pass to a higher iterate of the IFS, we have . The bounds for inverses of the contraction ratios of are
[TABLE]
and the probabilities satisfy . In order to satisfy (2.3), we need
[TABLE]
Since , it is enough to choose so that
[TABLE]
which is certainly possible. Now we apply Theorem 2.1 and obtain an exceptional set of Hausdorff dimension , such that for all , for all vectors of translations satisfying (2.4) and all probability vectors , with , we have , for some .
It remains to observe that we can recover from via a function which does not increase Hausdorff dimension. For instance, among the contraction ratios of there are . We can project to these coordinates and then take -th root component-wise, to obtain . This map is Lipschitz outside of the neighborhood of zero of radius . We obtain an exceptional set of as an image of under this map, and . For all , all satisfying , and all probability vectors , with , we have . This completes the proof of the derivation. ∎
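The counting behind the passage to a higher iterate can be illustrated directly. This is a minimal sketch with hypothetical example ratios 1/2, 1/3, 1/5 (chosen so that different exponent vectors give different products); it enumerates all compositions of n maps and compares the number of maps with the number of distinct contraction ratios.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import comb


def iterate_ratios(lams, n):
    """Contraction ratios of all len(lams)**n compositions of n maps."""
    ratios = []
    for word in product(range(len(lams)), repeat=n):
        r = Fraction(1)
        for j in word:
            r *= lams[j]  # the ratio of a composition is the product of ratios
        ratios.append(r)
    return ratios


lams = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 5)]  # example values only
n = 6
ratios = iterate_ratios(lams, n)

num_maps = len(ratios)                          # grows like m**n
num_distinct = len(set(ratios))                 # at most comb(n + m - 1, m - 1)
max_multiplicity = max(Counter(ratios).values())  # maps sharing one ratio
```

With m = 3 and n = 6 there are 729 maps but only comb(8, 2) = 28 distinct ratios, and the most common ratio is shared by 6!/(2!2!2!) = 90 maps. Exact rational arithmetic is used so that equal products are recognized as equal.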
The rest of the paper is devoted to the proof of Theorem 2.1.
3. Beginning of the Proof
We consider the Fourier transform , where is the invariant measure for the IFS (2.1), that is,
[TABLE]
It follows that
[TABLE]
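The functional equation for the Fourier transform can be verified exactly on finite approximations of the measure. The sketch below assumes maps of the form f_j(x) = lams[j] * x + bs[j] and the convention that the transform integrates exp(2*pi*i*xi*x); both are assumptions about the stripped formulas, and all names are hypothetical.

```python
import cmath
from itertools import product


def level_atoms(lams, bs, probs, n):
    """Atoms (position, mass) of the level-n approximation: the images of 0
    under all n-fold compositions f_{w_1} o ... o f_{w_n}."""
    atoms = []
    for word in product(range(len(lams)), repeat=n):
        x, mass = 0.0, 1.0
        for j in reversed(word):  # apply the innermost map first
            x = lams[j] * x + bs[j]
            mass *= probs[j]
        atoms.append((x, mass))
    return atoms


def hat(atoms, xi):
    """Fourier transform of a finite atomic measure at frequency xi."""
    return sum(m * cmath.exp(2j * cmath.pi * xi * x) for x, m in atoms)


def recursion_gap(lams, bs, probs, n, xi):
    """Difference between the level-(n+1) transform and the right-hand side
    sum_j p_j * exp(2*pi*i*b_j*xi) * hat_n(lam_j * xi)."""
    lhs = hat(level_atoms(lams, bs, probs, n + 1), xi)
    atoms_n = level_atoms(lams, bs, probs, n)
    rhs = sum(
        p * cmath.exp(2j * cmath.pi * b * xi) * hat(atoms_n, lam * xi)
        for lam, b, p in zip(lams, bs, probs)
    )
    return abs(lhs - rhs)
```

The gap is zero up to rounding for every level n, and letting n tend to infinity gives the corresponding identity for the invariant measure itself.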
We can estimate
[TABLE]
Denote
[TABLE]
Recall that by assumption, and use an elementary inequality
[TABLE]
We then obtain from (3.1), denoting by the distance from to the nearest integer:
[TABLE]
using that .
Next we introduce some notation. Let . For a word let be the number of ’s in , and let . For we will write
[TABLE]
where . (Note that ; hopefully, this will not cause confusion; in any case, we do not need any more.) Further, let be the prefix of of length ; if , this is the empty word, by convention.
Iterating (3.2) we obtain
[TABLE]
Notation 3.1.
We will consider as the vertex set of a directed graph, with a directed edge going from to each of , where is ’th unit vector. We will then write . A vertex is a descendant of of level if there is a path of length from to (the length of a path is the number of edges). We will identify a word with a path of length in , formed by the sequence of vertices and denote this path by . It is clear that for .
We will write if is a descendant of and . Equivalently, iff for some, possibly empty, subset . Thus implies that either , or is a descendant of of level .
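The identification of words with paths can be sketched in code. This is an illustration under assumptions: letters are taken to be 0, ..., d-1, vertices are vectors of letter counts, and the path starts at the zero vector for the empty prefix; the paper's exact indexing is not reproduced, and the function names are hypothetical.

```python
def word_to_path(word, d):
    """Vertices visited by the path of `word`: the letter-count vectors of
    its prefixes, starting from the zero vector (the empty prefix)."""
    vertex = [0] * d
    path = [tuple(vertex)]
    for letter in word:
        vertex[letter] += 1  # follow the edge in direction e_letter
        path.append(tuple(vertex))
    return path


def descendant_level(u, v):
    """Return the level at which v is a descendant of u, or None if there is
    no directed path from u to v (level 0 means v equals u)."""
    if any(b < a for a, b in zip(u, v)):
        return None
    return sum(v) - sum(u)
```

A path of length n has n edges and n + 1 vertices in this convention, and different words with the same letter counts end at the same vertex, which is why many maps of the iterated IFS share one contraction ratio.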
Definition 3.2.
Let , , and . Say that a vertex is -good if and
[TABLE]
(recall that denotes the distance from the nearest integer).
Further, say that a vertex is “on a -good track” if there exists that is -good and .
Finally, we say that an edge is -good if is -good and . (Notice that the 1-st coordinate, corresponding to , is “special” by construction, see (2.4) and (3.3).)
Consider . Then for all , by the assumption . It follows from (3.3), roughly speaking, that in order to have a power decay for for at this scale, it is sufficient that for “most” (up to exponentially small number) words there is a fixed positive proportion of -good edges on the path corresponding to , for some . With this in mind, we define the exceptional set of at scale as follows:
Definition 3.3.
Fix and , and let be the set of such that there exists and a word with the properties:
[TABLE]
Further, we define the exceptional set by
Let
[TABLE]
Theorem 2.1 will follow, once we prove the next two propositions:
Proposition 3.4.
For all sufficiently large, there exists , depending on , such that for all , for all , with , and all satisfying (2.4), we have .
Proposition 3.5.
For all sufficiently large we have .
4. Fourier decay for non-exceptional
Fix , where is given by (3.5) and is fixed, sufficiently large. (A specific value for will be chosen in (5.6).) Then for all sufficiently large. Fix such an . The condition means, by definition, that for every and for every , the number of vertices “on a -good track” on the path is greater than . Fix . Since , , and are now fixed, we will omit when talking about vertices and edges that are good or “on a good track”.
We will consider as a probability space, with the Bernoulli measure , and the “random environment” provided by the configuration of good vertices and edges. Let . Let us introduce the following random variables for :
- for , is the number of vertices on the path having a good vertex among its -level descendants;
- is the number of good vertices on the path ;
- is the number of good edges on the path .
Notice that for every vertex of that is “on a good track”, there is a vertex of that had a good vertex among its -level descendants, and this mapping is at most -to-. It follows that, with probability one,
[TABLE]
Lemma 4.1.
There exist , , and , depending only on and , such that, assuming is sufficiently large (depending only on and ), we have
[TABLE]
We first deduce power Fourier decay for from the lemma.
Proof of Proposition 3.4.
Consider the sum in the inequality (3.3) and split it according to whether or . The sum over such that , is bounded by . If is such that , then the corresponding term in the right-hand side of (3.3) is estimated from above by $\bigl(1-\frac{\pi\varepsilon}{2}\rho^{2}\bigr)^{\delta N}$, by the definition of a good edge (we also use the fact that , since is a probability measure). Then Lemma 4.1 implies, for sufficiently large:
[TABLE]
Since was arbitrary, sufficiently large, and arbitrary in , this implies that for some . ∎
As a step in the proof of Lemma 4.1, we will first establish the following
Lemma 4.2.
There exist , , and , for , such that, for sufficiently large,
[TABLE]
Proof of Lemma 4.2.
We will show this by induction in , going from to . For the claim trivially holds, by (4.1). Fix and assume that (4.3) holds for . Consider the sequence of random variables
[TABLE]
We claim that this is a submartingale; in fact,
[TABLE]
Indeed, we have either (a) , or (b) . The former case occurs when has no good descendants of level , and then has no good descendants of level . Thus in case (a) we have and .
In case (b), on the other hand, has a good descendant of level , and then has a good descendant of level , with probability , independently of the past. Then either or , hence or .
Formally, we obtain
[TABLE]
Since , we have
[TABLE]
confirming the claim that is a submartingale.
We are going to apply the Azuma-Hoeffding inequality, which says that, given that is a submartingale, if for all , then
[TABLE]
See, e.g., [1] for the (two-sided) Azuma-Hoeffding inequality for martingales. The one-sided inequality for submartingales is proved similarly, see e.g., [5].
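A quick simulation shows how the one-sided bound behaves. The sketch assumes the standard form of the inequality for a submartingale with increments bounded by c over n steps, namely that the probability of dropping by at least t below the starting value is at most exp(-t**2 / (2 * n * c**2)); the paper's exact display is not reproduced, and the random walk used here is only a hypothetical example of a submartingale.

```python
import math
import random


def azuma_lower_tail_bound(t, c, n):
    """Standard one-sided Azuma-Hoeffding bound for a drop of size t after
    n steps with increments bounded by c in absolute value."""
    return math.exp(-t * t / (2.0 * n * c * c))


def empirical_lower_tail(n, t, drift=0.0, trials=4000, seed=1):
    """Fraction of +-1 random walks with S_n <= -t. Each step is +1 with
    probability 0.5 + drift / 2, so for drift >= 0 the walk is a
    submartingale with increments bounded by 1."""
    rng = random.Random(seed)
    up_probability = 0.5 + drift / 2.0
    hits = 0
    for _ in range(trials):
        s = 0
        for _ in range(n):
            s += 1 if rng.random() < up_probability else -1
        if s <= -t:
            hits += 1
    return hits / trials
```

For n = 100 and t = 20 the bound is exp(-2), and the simulated frequency stays well below it, more so when the drift is positive. In the proof the inequality is used in the same direction: a submartingale is unlikely to fall far below its starting value.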
We have , hence taking yields
[TABLE]
Since is bounded, we have for sufficiently large:
[TABLE]
Recall that , and
[TABLE]
by the inductive assumption. Therefore for sufficiently large,
[TABLE]
and (4.3) follows. ∎
Proof of Lemma 4.1.
Consider the sequence of random variables
[TABLE]
We claim that is a martingale; in fact,
[TABLE]
This is proved analogously to the proof of the submartingale property for above. If is not a good vertex, then the edge , with , is not good either, and we have . If, on the other hand, is a good vertex, then the edge , with , is good with probability , and this is independent of the past. Thus, if , then
[TABLE]
This implies (4.6); the formal computation, similar to the above, is left to the reader.
Applying the Azuma-Hoeffding inequality to , in view of , after a computation similar to that above, we can estimate, for large enough, using (4.3) for :
[TABLE]
This implies the desired estimate (4.2). ∎
5. Dimension of the exceptional set
Fix . This means that for infinitely many . Fix such an , sufficiently large. We will show that this imposes constraints on allowing us to construct a good cover of . By definition of , there exists and a word , such that the number of vertices “on a good -track” on the path does not exceed . Fix such a and , and for let
[TABLE]
that is, is the nearest integer to and . One should keep in mind that and depend on and , but we suppress this in notation to reduce “clutter”.
The next lemma is analogous to the ones appearing in other variants of the Erdős-Kahane argument; see e.g., [28, Lemma 6.3].
Lemma 5.1.
Let be given by (3.5) and
[TABLE]
Let , for some (in particular, ), such that . The following hold:
(i) Given and , there are at most possibilities for .
(ii) Given and , the number is uniquely determined, provided
[TABLE]
that is, provided none of the is -good.
Proof.
We have, by assumption,
[TABLE]
The idea is that
[TABLE]
hence must be not too far from . First note that
[TABLE]
where we used the bound . Next,
[TABLE]
Therefore,
[TABLE]
using (5.3) in the last step. Now both parts of the lemma follow easily. Indeed, is an integer.
(i) Since , once and are given, there are at most possibilities for , see (5.1).
(ii) The choice of will be unique, provided
[TABLE]
see (3.5). ∎
Corollary 5.2.
Suppose that , and we are given for all such that ; assume that all of them satisfy . Then
(i) for any , such that , there are at most possibilities for ;
(ii) for any , such that , assuming that neither , nor any of , with , is -good, is uniquely determined.
Proof.
Fix such that . Then , for some . We have ; suppose for some . If , then , so and is already known by assumption.
If , then and satisfy
[TABLE]
Moreover, so and are already given, and we are exactly in the situation of Lemma 5.1. Applying the lemma yields the desired result. ∎
Proof of Proposition 3.5.
Let be maximal, such that
[TABLE]
Note that
[TABLE]
Let
[TABLE]
be the -th vertex on the path corresponding to the word , so that . We have chosen in such a way that
[TABLE]
Recall that is fixed and the number of vertices “on a good -track” on the path does not exceed . We are going to estimate from above the number of possible configurations of integers , where for some . Note that for any there are vertices such that .
We start with the “initial configuration” of for such that . By the choice of we have
[TABLE]
so for all such that . It follows that the total number of possibilities for for all such that , is at most
[TABLE]
Now we follow the path backwards, applying Corollary 5.2 at each step. Fix . Part (i) of the corollary says that for any , with , there are at most choices for , once all the , with are determined. Part (ii) of the corollary says that if none of , are on a “good -track”, those are determined uniquely. By assumption, there are no more than vertices of that are “on a good -track”, hence this will affect at most transitions between and . On each transition, we determine at most “new” values of (this is an “overcount,” but we do not try to be precise here). If we fix the subset of corresponding to the vertices “on a good -track” on the path , we will obtain at most
[TABLE]
total configurations, where . Taking into account all the possibilities for the subset in question and also possible values of yields that the total number of configurations of , for all under consideration, is at most
[TABLE]
Next, note that the knowledge of all associated with the path gives a good approximation of . In fact, we have for , so and are among the “known” ones. Estimating as in Lemma 5.1, we have
[TABLE]
and . It follows that the knowledge of all associated with the path gives a cover of the exceptional by balls of diameter . Taking into account that the number of words is equal to , we obtain that the exceptional set at scale may be covered by
[TABLE]
balls of diameter . Recall that by (2.3). Thus we can choose such that
[TABLE]
Then
[TABLE]
whence , as desired. ∎
Acknowledgement. I am grateful to Ori Gurel-Gurevich for his help with the probabilistic argument, and to Tuomas Sahlsten for helpful discussions.
- [1] Noga Alon and Joel H. Spencer. The probabilistic method. Wiley-Interscience Series in Discrete Mathematics and Optimization. John Wiley & Sons, Inc., New York, 1992. With an appendix by Paul Erdős, A Wiley-Interscience Publication.
- [2] Jean Bourgain and Semyon Dyatlov. Fourier dimension and spectral gaps for hyperbolic surfaces. Geom. Funct. Anal., 27(4):744–771, 2017.
- [3] Alexander I. Bufetov and Boris Solomyak. On the modulus of continuity for spectral measures in substitution dynamics. Adv. Math., 260:84–129, 2014.
- [4] Alexander I. Bufetov and Boris Solomyak. The Hölder property for the spectrum of translation flows in genus two. Israel J. Math., 223(1):205–259, 2018.
- [5] Fan Chung and Linyuan Lu. Concentration inequalities and martingale inequalities: a survey. Internet Math., 3(1):79–127, 2006.
- [6] Xin-Rong Dai, De-Jun Feng, and Yang Wang. Refinable functions with non-integer dilations. J. Funct. Anal., 250(1):1–20, 2007.
- [7] H. Davenport, P. Erdős, and W. J. Le Veque. On Weyl’s criterion for uniform distribution. Michigan Math. J., 10:311–314, 1963.
- [8] Paul Erdős. On a family of symmetric Bernoulli convolutions. Amer. J. Math., 61:974–976, 1939.
