Estimating the Frequency of a Clustered Signal
Xue Chen, Eric Price

TL;DR
This paper develops new methods for accurately estimating the central frequency of clustered signals with Fourier spectrum in a narrow band, improving bounds on recovery accuracy and establishing fundamental limits.
Contribution
It introduces generic conditions for frequency estimation, improves bounds for $k$-Fourier-sparse signals, and provides a new ratio bound with independent applications.
Findings
Achieves frequency recovery to within $\Delta+\tilde{O}(k^3)$.
Improves the previous bound from $O\big(\Delta+\tilde{O}(k^5)\big)^{1.5}$ to $\Delta+\tilde{O}(k^3)$.
Establishes a lower bound of $\Delta+\tilde{O}(k^2)$ on frequency estimation accuracy.
Xue Chen
Northwestern University. Supported by research funding from Northwestern University. Part of this work was done while the author was at the University of Texas at Austin, supported by NSF Grant CCF-1526952 and a Simons Investigator Award (#409864, David Zuckerman).
Eric Price
The University of Texas at Austin. Supported in part by NSF Award CCF-1751040 (CAREER).
Abstract
We consider the problem of locating a signal whose frequencies are “off grid” and clustered in a narrow band. Given noisy sample access to a function $g(t)$ with Fourier spectrum in a narrow range $[f_0-\Delta,f_0+\Delta]$, how accurately is it possible to identify $f_0$? We present generic conditions on $g$ that allow for efficient, accurate estimates of the frequency. We then show bounds on these conditions for $k$-Fourier-sparse signals that imply recovery of $f_0$ to within $\Delta+\tilde{O}(k^{3})$ from samples on $[-1,1]$. This improves upon the best previous bound of $O\big(\Delta+\tilde{O}(k^{5})\big)^{1.5}$. We also show that no algorithm can do better than $\Delta+\tilde{O}(k^{2})$.
In the process we provide a new bound on the ratio between the maximum and average value of continuous $k$-Fourier-sparse signals, which has independent application.
1 Introduction
A natural question, dating at least to the work of Prony in 1795, is to estimate a signal from samples, assuming the signal has a $k$-sparse Fourier representation, i.e., that the signal is a sum of $k$ complex exponentials: $g(t)=\sum_{j=1}^{k}v_{j}e^{2\pi\mathbf{i}f_{j}t}$ for some set of frequencies $f_{j}$ and coefficients $v_{j}$.
If the frequencies are located on a discrete grid (giving a sparse discrete Fourier transform), then a long line of work has studied efficient algorithms for recovering the signal (e.g., [11, 7, 1, 8, 9, 10]). If the frequencies are not on a grid, then Prony’s method from 1795 [13] or matrix pencil [3] can still identify them in the absence of noise. With noise, however, one cannot robustly recover frequencies that are too close together: if one listens to a signal for the interval then any two frequencies and will be -close to each other, and so cannot be distinguished with noise. As shown in [12], this nonrobustness grows exponentially in . On the other hand, [12] also showed that recovery with polynomially small noise is possible if all the frequencies have separation , and [14] showed that a constant fraction of noise is tolerable with separation , where is the bandlimit of the signal.
So what is possible for arbitrary Fourier-sparse signals, without any assumption of frequency separation? One cannot hope to identify the frequencies exactly, but one can still estimate the signal itself. If two frequencies are similar enough to be indistinguishable over the sampled interval, we do not need to distinguish them. In [5], this led to an algorithm for an arbitrary -Fourier-sparse signal that used samples to estimate it with only a constant factor increase in the noise. However, this polynomial is fairly poor.
Since prior work could handle the case of well-separated frequencies, a key challenge in [5] is the setting with all the frequencies in a narrow cluster. Formally, consider the following subproblem: if all the frequencies of the signal lie in a narrow band , how accurately can we estimate ? Note that while we would like an efficient algorithm that takes a small number of samples, the key question is information theoretic. And we can ask this question more generally: if the signal is not -sparse, but still has all its frequencies in a narrow band, can we locate that band?
Question 1.1.
Let be a signal with Fourier transform supported on , for some . Suppose that we can sample from at points in , where could be any bounded noise on with
[TABLE]
for a small constant . Under what conditions on can we estimate , and how accurately?
One might expect to be able to estimate to for all functions ; after all, is just a combination of individual frequencies, each of which points to some frequency in the right range, and each individual frequency in isolation can be estimated to within in the presence of noise. Unfortunately, this intuition is false.
To see this, consider the family of -sparse Fourier functions with , i.e.,
[TABLE]
By sending and taking a Taylor expansion, this family can get arbitrarily close to any degree polynomial, on any interval . Thus, to solve the question, one would also need to solve it when is a polynomial even for arbitrarily small .
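The limiting argument can be checked numerically. The sketch below is an illustration under our own assumptions, not a construction taken from the paper: it uses the frequencies $0,\epsilon,\dots,d\epsilon$ with signed binomial coefficients (both our choice), so that the $(d+1)$-sparse sum equals $(e^{2\pi\mathbf{i}\epsilon t}-1)^{d}/(2\pi\mathbf{i}\epsilon)^{d}$, which tends to the monomial $t^{d}$ on $[-1,1]$ as $\epsilon\to 0$. The names `sparse_fourier_monomial` and `max_error` are hypothetical.

```python
import numpy as np
from math import comb

def sparse_fourier_monomial(t, d, eps):
    """(d+1)-sparse Fourier sum with frequencies 0, eps, ..., d*eps.

    By the binomial theorem the sum equals (exp(2*pi*i*eps*t) - 1)**d;
    dividing by (2*pi*i*eps)**d makes it converge to t**d as eps -> 0.
    """
    t = np.asarray(t, dtype=float)
    total = np.zeros_like(t, dtype=complex)
    for j in range(d + 1):
        coeff = (-1) ** (d - j) * comb(d, j)   # signed binomial coefficient
        total += coeff * np.exp(2j * np.pi * j * eps * t)
    return total / (2j * np.pi * eps) ** d

def max_error(d, eps, num=2001):
    """Largest deviation from t**d over a grid on [-1, 1]."""
    t = np.linspace(-1.0, 1.0, num)
    return float(np.max(np.abs(sparse_fourier_monomial(t, d, eps) - t ** d)))
```

Shrinking `eps` shrinks the deviation roughly linearly, until floating-point cancellation in the alternating sum takes over for very small `eps`.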
There are two ways in which being a degree polynomial can lead to trouble. The first is that could itself be a Taylor expansion of . If , this Taylor approximation will be quite accurate on ; with the noise , the observed signal can equal . Thus the algorithm has to output , which can be far from the “true” answer .
The second way in which can lead to trouble is by removing most of the signal energy. If is the (slightly shifted) Chebyshev polynomial $g(t)=T_{d}\big(t/T+O(\frac{\log^{2}d}{d^{2}})\big)$, then for $t\leq\big(1-O(\frac{\log^{2}d}{d^{2}})\big)T$, while for $t\geq\big(1-O(\frac{\log^{2}d}{d^{2}})\big)T$. That is to say, the majority of the energy of can lie in the final fraction of the interval. In such a case, a small constant noise level can make samples outside that size region equal to zero, and hence completely uninformative; and samples in that region still have to tolerate noise. This leads to an “effective” interval size of , leading to accuracy .
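The energy concentration of the shifted Chebyshev polynomial can be observed numerically. The following is a rough sketch under stated assumptions: we normalize $T=1$, replace the $O(\cdot)$ shift by exactly $c\log^{2}d/d^{2}$ with $c=1$ (an arbitrary choice of constant), and approximate averages on a uniform grid. The function names are hypothetical.

```python
import numpy as np

def shifted_chebyshev(t, d, shift):
    """Evaluate T_d(t + shift) for t in [-1, 1] and 0 <= shift < 2."""
    x = np.asarray(t, dtype=float) + shift
    # T_d(x) = cos(d arccos x) on [-1, 1] and cosh(d arccosh x) for x > 1.
    inside = np.cos(d * np.arccos(np.clip(x, -1.0, 1.0)))
    outside = np.cosh(d * np.arccosh(np.maximum(x, 1.0)))
    return np.where(x <= 1.0, inside, outside)

def energy_profile(d, c=1.0, num=400001):
    """Return (sup/mean of |g|^2 on [-1, 1], share of energy in the final slice)."""
    shift = c * np.log(d) ** 2 / d ** 2
    t = np.linspace(-1.0, 1.0, num)
    energy = shifted_chebyshev(t, d, shift) ** 2
    tail = t >= 1.0 - shift                    # final slice of width `shift`
    ratio = energy.max() / energy.mean()
    tail_share = energy[tail].sum() / energy.sum()
    return float(ratio), float(tail_share)
```

For moderate degrees such as `d = 50`, most of the energy sits in the last slice even though that slice is a tiny fraction of the interval.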
Our main result is that, in a sense, these two types of difficulties are the only ones that arise. We can measure the second type of difficulty by looking at how much larger the maximum value of is than its average:
[TABLE]
We can measure the former by observing that while a polynomial may approximate a complex exponential on a bounded region, as the polynomial will blow up. In particular, we take the such that
[TABLE]
for all . We show that if and are bounded, one can estimate to within , which is almost tight from the above discussion of polynomials. Moreover, the time and number of samples required are fairly efficient:
Theorem 1.2.
Given any , and , let be a signal with the following properties:
1. where .
2. $\sup_{t\in[-T,T]}\big[|g(t)|^{2}\big]\leq R\cdot\mathbb{E}_{t\in[-T,T]}\big[|g(t)|^{2}\big]$.
3. grows as at most $\mathsf{poly}(R)\cdot\mathbb{E}_{t\in[-T,T]}\big[|g(t)|^{2}\big]\cdot\big|\frac{t}{T}\big|^{S}$ for .
Let be the observable signal on , where $\mathbb{E}_{t\in[-T,T]}\big[|\eta(t)|^{2}\big]\leq\epsilon\cdot\mathbb{E}_{t\in[-T,T]}\big[|g(t)|^{2}\big]$ for a sufficiently small constant . For and any , there exists an efficient algorithm that takes samples from and outputs satisfying with probability at least .
Application to sparse Fourier transforms
Specializing to -Fourier-sparse signals, we give bounds on and for this family. Since (as described above) this family can approximate degree- polynomials, we know that and ; we show that and . Thus, whenever is between and , we can identify -Fourier-sparse signals to within . This is an improvement over the results in [5] in several ways.
Formally, for a given sparsity level , we consider signals in
[TABLE]
Theorem 1.3.
For any and ,
[TABLE]
It was previously known that [5], and this fact was used in [2]. (Thus, our improved bound on immediately implies an improvement in Theorem 8 of [2], from to .)
Next we bound the growth for any .
Theorem 1.4.
There exists such that for any and , .
This is analogous to Theorem 5.5 of [5], which proves a bound of rather than . These bounds are incomparable, but the bound is actually more useful for this problem: what really matters is showing that is not too large just outside the interval. Theorem 1.4 gives the “correct” polynomial dependence at .
We can now apply Theorem 1.2 to get an efficient algorithm to recover the center of a cluster of frequencies within accuracy .
Theorem 1.5.
Given and , let be the ratio between the maximum and average value of continuous -Fourier-sparse signals defined in (1). Given , let be a -Fourier-sparse signal centered around : where each and be the observable signal on , where $\mathbb{E}_{t\in[-T,T]}\big[|\eta(t)|^{2}\big]\leq\epsilon\cdot\mathbb{E}_{t\in[-T,T]}\big[|g(t)|^{2}\big]$ for a sufficiently small constant .
For any , there exist and an efficient algorithm that takes samples from and outputs satisfying with probability at least .
Note that the sample complexity here is not . This is because, based on the structure of the problem, we can use a nonuniform sampling procedure that performs better. Otherwise this theorem is just Theorem 1.2 applied to the and from Theorems 1.3 and 1.4.
Theorem 1.5 is a direct improvement on Theorem 7.5 of [5], which for could estimate to within accuracy and used samples. In particular, in addition to improving the additive term, our result avoids a multiplicative increase in the bandwidth of .
The main technical lemma in proving Theorems 1.2 and 1.5 is a filter function with a compactly supported Fourier transform that simulates a box function on for any satisfying the conditions in Theorem 1.2.
Lemma 1.6.
Given any , , and , there exists a filter function with $\big|\mathsf{supp}(\widehat{H})\big|\leq\frac{\tilde{O}(R+S)}{T}$ such that for any satisfying the second and third conditions in Theorem 1.2,
1. is close to a box function on : .
2. The tail of is small:
Organization
We introduce some notation and tools in Section 2. Then we provide a technical overview in Section 3. We show our filter function and prove Lemma 1.6 in Section 4. Next we present the frequency estimation algorithm of Theorem 1.2 in Section 5. Finally we prove the results about sparse Fourier transforms, Theorem 1.3 and Theorem 1.4, in Section 6.
2 Preliminaries
In the rest of this work, we fix the observation interval to be and define
[TABLE]
because we could rescale to and to .
We first review several facts about the Fourier transform. The Fourier transform of an integrable function is
[TABLE]
We use to denote the pointwise dot product and to denote . Similarly, we use to denote the convolution of and : . In this work, we always set as the convolution . Notice that and .
We define the box function and its Fourier transform function as follows. Given a width , the box function iff ; and its Fourier transform is for any .
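As a sanity check on the box/sinc pair, the sketch below compares a numerical Fourier integral with the closed form. It makes assumptions that may differ from the normalization used in this paper: a unit-height box on $[-w/2,w/2]$ and the convention $\widehat{f}(\xi)=\int f(t)e^{-2\pi\mathbf{i}\xi t}\,\mathrm{d}t$, under which the transform is $w\cdot\operatorname{sinc}(w\xi)$ with $\operatorname{sinc}(x)=\sin(\pi x)/(\pi x)$. The function names are hypothetical.

```python
import numpy as np

def box_fourier_numeric(xi, width, num=200000):
    """Midpoint-rule approximation of the Fourier transform of a unit-height box."""
    # Midpoints of `num` equal cells covering [-width/2, width/2].
    t = (np.arange(num) + 0.5) / num * width - width / 2.0
    return complex(width * np.mean(np.exp(-2j * np.pi * xi * t)))

def box_fourier_closed_form(xi, width):
    """width * sinc(width * xi), where np.sinc(x) = sin(pi x) / (pi x)."""
    return float(width * np.sinc(width * xi))
```

Rescaling the box to height $1/w$ would remove the leading factor $w$ from the closed form.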
We state the Chernoff bound for random sampling [4].
Lemma 2.1.
Let be independent random variables in with expectation . For any and , with expectation 1 satisfies
[TABLE]
3 Proof Overview
We first outline the proofs of Lemma 1.6 and Theorem 1.2. Then we show the proof sketch of and of -Fourier-sparse signals.
The filter functions in Lemma 1.6.
Ideally, to satisfy the two claims in Lemma 1.6, we could set to be the box function on . However, by the uncertainty principle, it is impossible to make its Fourier transform compact using such an . Hence our construction of is in the inverse direction: we build by box functions and by the Fourier transform of box functions — the sinc function. In the rest of this discussion, we focus on using the sinc function to prove Lemma 1.6 given the properties of in Theorem 1.2.
We first notice that any with the following two properties is effective in Lemma 1.6 for satisfying for any and for :
1. for any of a large constant . This shows
[TABLE]
Because for any , the constant on the R.H.S. is at least , which implies the first claim of Lemma 1.6.
2. declines to for any . This shows
[TABLE]
which implies the second claim.
For ease of exposition, we start with . We plan to design a filter with compact dropping from at to at in a small range using the sinc function. To apply the sinc function, we notice that
[TABLE]
decays from 1 at to at , which matches the dropping of from to .
Then, to make for any , let us consider a convolution of and . Because most of the mass of the latter is in , this convolution keeps almost the same value in and drops down to at . At the same time, it will keep the compactness of since it corresponds to the dot product on the Fourier domain. By normalizing and scaling, this gives the desired for .
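The effect of convolving a sinc power with a box can be seen in a small discretized experiment. This is only a sketch with illustrative parameters of our own choosing (`a`, `power`, the grid, and the unit box on $[-1/2,1/2]$ are all assumptions), not the filter of Definition 4.1. Since $\operatorname{sinc}(at)^{p}$ has Fourier transform supported on an interval of width $p\cdot a$, and convolution in time is a product in frequency, the result keeps a compactly supported transform while being flat inside the box and small away from it.

```python
import numpy as np

def sinc_power_filter(a=20.0, power=4, half_range=4.0, dt=1e-3):
    """Convolve a unit box on [-1/2, 1/2] with sinc(a t)**power on a grid.

    The output is normalized by the kernel mass, so it is close to 1 well
    inside the box and close to 0 well outside it.
    """
    n = int(round(half_range / dt))
    t = np.arange(-n, n + 1) * dt                 # symmetric grid, t[n] == 0
    kernel = np.sinc(a * t) ** power              # Fourier support has width power * a
    box = (np.abs(t) <= 0.5).astype(float)
    h = np.convolve(kernel, box, mode="same") / kernel.sum()
    return t, h
```

Raising `power` sharpens the decay outside the box at the cost of a wider Fourier support, which is the trade-off the construction balances.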
Next we describe the construction of . The high level idea is to consider the decays of in segments rather than one segment of :
[TABLE]
For each segment, we provide a power of sinc functions matching its decay in like the construction of on . The final construction is the convolution of the dot product of all sinc powers and a box function, which appears in Section 4.
The Algorithm of Theorem 1.2.
Now we show how to estimate given the observable signal where and (with norm taken over defined in (2)). We instead consider with the filter function from Lemma 1.6 and the corresponding dot products and . The starting point is that for a sufficiently small , we expect
[TABLE]
because has Fourier spectrum concentrated around . This does not hold for all , but it does hold on average:
[TABLE]
This is because we can use Parseval’s identity to replace these integrals by an integral over Fourier domain—Parseval’s identity would apply if the integrals were from to , but because of the filter function , relatively little mass in lies outside . Then, the Fourier transform of the term inside the left square is . Note that has most of its mass in for , and every such frequency shrinks in the left by a factor . Thus, for , (3) holds.
To learn through , we design a sampling procedure to output satisfying
[TABLE]
Even though the above discussion shows the left hand side is smaller than the R.H.S. on average, a uniformly random may not satisfy it with good probability: may be true only for fraction of , while the corruption by adversarial noise has for a constant . At the same time, even for many points where some of them satisfy the above inequality, it is infeasible to verify such an given that is unknown. We provide a solution by adopting importance sampling: for random samples , we output with probability proportional to the weight .
We prove the correctness of this sampling procedure in Lemma 5.2 in Section 5.
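The importance sampling step can be sketched in a few lines. The version below is a simplification under our own assumptions: it omits the filter $H$, takes the output to be the ratio $y(t_i+\beta)/y(t_i)$, and weights sample $t_i$ by $|y(t_i)|^{2}$. The test signal `noisy_cluster` (two tones at $f_0\pm\Delta$ plus a bounded out-of-band perturbation) is hypothetical and serves only to exercise the code.

```python
import numpy as np

def sample_phase_ratio(y, beta, m, rng):
    """Draw m uniform times in [-1, 1] and return y(t + beta) / y(t) for one
    of them, chosen with probability proportional to |y(t)|^2."""
    t = rng.uniform(-1.0, 1.0, size=m)
    base = y(t)
    weights = np.abs(base) ** 2
    i = rng.choice(m, p=weights / weights.sum())   # importance sampling step
    return y(t[i] + beta) / base[i]

def noisy_cluster(t, f0=5.0, delta=0.1, noise=0.05):
    """Hypothetical observation: tones at f0 +/- delta plus bounded noise."""
    t = np.asarray(t, dtype=float)
    g = np.exp(2j * np.pi * (f0 - delta) * t) + np.exp(2j * np.pi * (f0 + delta) * t)
    return g + noise * np.exp(2j * np.pi * 17.3 * t)
```

On this signal the phase of the returned ratio is close to $2\pi f_0\beta$; weighting by $|y(t_i)|^{2}$ steers the choice away from sample points where the signal is weak and the ratio is dominated by noise.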
Finally, learning is not enough to learn : because of the noise, we only learn to within a constant , which gives to within ; and because of the different branches of the complex logarithm, this is only up to integer multiples of . Therefore to fully learn , we repeat the sampling procedure at logarithmically many different scales of , from to .
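The multi-scale idea can be illustrated with a generic doubling search. This sketch is not the frequency search algorithm of [5]; it assumes a hypothetical `phase_oracle(beta)` that returns the phase of $e^{2\pi\mathbf{i}f_0\beta}$ up to an additive error well below $\pi/3$, and it assumes $|f_0|<1/(2\beta_{\min})$ so that the coarsest scale is unambiguous.

```python
import numpy as np

def refine_frequency(phase_oracle, beta_min, num_scales):
    """Resolve f0 from noisy phases of exp(2*pi*i*f0*beta) at doubling scales."""
    beta = beta_min
    f_hat = phase_oracle(beta) / (2 * np.pi * beta)   # coarse, unambiguous estimate
    for _ in range(num_scales):
        beta *= 2
        theta = phase_oracle(beta)
        # Candidates are (theta + 2*pi*n) / (2*pi*beta); keep the one nearest f_hat.
        n = np.round(f_hat * beta - theta / (2 * np.pi))
        f_hat = (theta + 2 * np.pi * n) / (2 * np.pi * beta)
    return float(f_hat)
```

Each doubling of `beta` halves the uncertainty in the estimate, while the previous estimate selects the correct branch of the logarithm.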
-Fourier-sparse signals.
Finally, we show and such that for any — not necessarily one with the clustered together—
[TABLE]
We first review the previous argument of [5]. The key point is to show for some that is a linear combination of using bounded integer coefficients for any . Then
[TABLE]
If we think of as the supremum and as the average —which we can formally do up to logarithmic factors by averaging over —this shows . One natural idea to improve it is to use a smaller value and a shorter linear combination [6]. However, for such a combination when is approximately the degree Chebyshev polynomial. In this work, we use a geometric sequence to control such that instead of , which provides an improvement of a factor on .
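The basic fact behind such linear combinations is that equally spaced shifts of a Fourier-sparse signal are linearly dependent. The sketch below shows only this Prony-type recurrence, with coefficients read off from the polynomial $\prod_{j}(z-z_j)$ for $z_j=e^{2\pi\mathbf{i}f_j\theta}$; it makes no attempt to control the size of the coefficients, which is the actual difficulty addressed in the text. The function names are hypothetical.

```python
import numpy as np

def shift_recurrence_coeffs(freqs, theta):
    """Coefficients c_0..c_k (with c_k = 1) such that
    sum_l c_l * g(t + l*theta) = 0 for every signal g with frequencies `freqs`."""
    z = np.exp(2j * np.pi * np.asarray(freqs, dtype=float) * theta)
    return np.poly(z)[::-1]        # np.poly lists the highest power first

def recurrence_residual(freqs, amplitudes, theta, t):
    """|sum_l c_l * g(t + l*theta)| for g(s) = sum_j v_j exp(2*pi*i*f_j*s)."""
    def g(s):
        return sum(v * np.exp(2j * np.pi * f * s) for f, v in zip(freqs, amplitudes))
    c = shift_recurrence_coeffs(freqs, theta)
    return abs(sum(c_l * g(t + l * theta) for l, c_l in enumerate(c)))
```

The residual vanishes up to rounding for any amplitudes and any `t`, since each exponential contributes a factor $\prod_{j}(z-z_j)$ evaluated at one of its own roots.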
Then we bound for at . The intuition is that given (4) holds for any in terms of with , it implies for . Combining this with an alternate bound for , it completes the proof of Theorem 1.4 about .
Finally we notice that we could improve the sample complexity in Theorem 1.5 to using a biased distribution [6] to generate . These results about -Fourier-sparse signals appear in Section 6.
4 Our Filter Function
The main result is an explicit filter function with compact support that is close to the box function on for any satisfying the conditions in Theorem 1.2.
We show our filter function as follows.
Definition 4.1.
Given , the growth rate and an even constant , we define the filter function
[TABLE]
where is a parameter to normalize . On the other hand, its Fourier transform is
[TABLE]
whose support size is .
We prove Lemma 1.6 using with a large constant and a scale parameter . For convenience, we state the full version of Lemma 1.6 for as follows.
Theorem 4.2.
Let , let be a large even constant, and define . Consider any function satisfying the following two conditions:
** 2. 2.
And for ,
Then the filter function $H(\alpha x)$ is such that $H(\alpha x)\cdot g(x)$ satisfies
1. $\int_{-1}^{1}|g(x)\cdot H(\alpha x)|^{2}\mathrm{d}x\geq 0.9\int_{-1}^{1}|g(x)|^{2}\mathrm{d}x$.
2. $\int_{-1}^{1}|g(x)\cdot H(\alpha x)|^{2}\mathrm{d}x\geq 0.95\int_{-\infty}^{\infty}|g(x)\cdot H(\alpha x)|^{2}\mathrm{d}x$.
3. for any .
For completeness, we show a few properties of and finish the proof of Theorem 4.2 in Appendix 7.
5 Frequency Estimation
We show the algorithm for frequency estimation and prove Theorem 1.2 in this section. We fix and use the definition to restate the theorem.
Theorem 5.1.
Given any , and , let be a signal with the following properties:
1. where .
2. $\sup_{t\in[-1,1]}\big[|g(t)|^{2}\big]\leq R\cdot\|g\|_{2}^{2}$.
3. grows as at most for .
Let be the observable signal on , where for a sufficiently small constant . For and any , there exists an efficient algorithm that takes samples from and outputs satisfying with probability at least .
For convenience, we set for any signal with the filter function defined in Theorem 4.2 such that .
Given the observation with most Fourier mass concentrated around , the main technical result in this section is an estimation of through .
Lemma 5.2.
Given parameters , and , let be a signal satisfying the three conditions in Theorem 1.2 for some and .
Let be the observable signal on where the noise for a sufficiently small constant . There exist a constant and an algorithm such that for any , it takes samples to output satisfying with probability at least 0.6.
We show our algorithm in Algorithm 1. We finish the proof of Theorem 5.1 here and defer the proof of Lemma 5.2 to Section 5.1.
Proof of Theorem 5.1. From Lemma 5.2, gives a good estimation of with probability 0.6 for any . We use the frequency search algorithm of Lemma 7.3 in [5] with the sampling procedure in Lemma 5.2. Because the algorithm in [5] uses the sampling procedure times to return a frequency satisfying with probability at least , the sample complexity is . ∎
5.1 Proof of Lemma 5.2
For , we have the following concentration lemma for estimation .
Claim 5.3.
Given any satisfying the three conditions in Theorem 1.2 and any and , there exists such that for random samples , with probability ,
[TABLE]
Proof.
Notice that . From the Chernoff bound in Lemma 2.1, suffices to estimate . ∎
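The role of the sup-to-average ratio in this concentration step can be illustrated with a plain Monte Carlo estimate. This is a sketch under our own assumptions: it estimates $\mathbb{E}_{t\in[-1,1]}|g(t)|^{2}$ for an unfiltered signal from uniform samples. If $\sup|g|^{2}\leq R\cdot\mathbb{E}|g|^{2}$, the variance of a single sample is at most $R\cdot(\mathbb{E}|g|^{2})^{2}$, so on the order of $R/\epsilon^{2}$ samples give relative error $\epsilon$ with constant probability. The function name is hypothetical.

```python
import numpy as np

def estimate_energy(g, m, rng):
    """Average |g(t)|^2 over m uniform samples t in [-1, 1]."""
    t = rng.uniform(-1.0, 1.0, size=m)
    return float(np.mean(np.abs(g(t)) ** 2))
```

For example, the two-tone signal $e^{2\pi\mathbf{i}\cdot 3t}+\tfrac12 e^{2\pi\mathbf{i}\cdot 3.2t}$ has $|g(t)|^{2}=1.25+\cos(0.4\pi t)$, whose average over $[-1,1]$ is $1.25+\sin(0.4\pi)/(0.4\pi)$, and a few thousand samples recover it to within a few percent.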
Next we consider the effect of noise and .
Claim 5.4.
With probability over random samples in , .
Proof.
From Theorem 4.2, . Thus Claim 5.3 implies for with probability 0.99.
At the same time, because , with probability at least from the Markov inequality. This is also less than from the upper bound on .
We have
[TABLE]
By the Cauchy-Schwarz inequality, the cross term . From the discussion above,
[TABLE]
When is a small constant, it is at least . ∎
We set for convenience and bound it as follows.
Claim 5.5.
Given any small constant , , and for , .
Proof.
Notice that where such that
[TABLE]
We bound through
[TABLE]
Therefore we write
[TABLE]
Because and , . So
[TABLE]
On the other hand,
[TABLE]
which is less than
From all discussion above, . ∎
For sufficiently small and , by Markov's inequality, we have the following corollary.
Corollary 5.6.
For sufficiently small constants and , with probability over random samples in , .
Finally we finish the proof of Lemma 5.2.
Proof of Lemma 5.2. We assume Claim 5.4 and Corollary 5.6 hold in this proof, i.e.,
[TABLE]
For a random sample , we bound
[TABLE]
This is . Thus with probability , is less than . From all discussion above, with probability 0.6. ∎
6 Bounds on Fourier-sparse Signals
We consider where each in this section. The main result is to prove and for arbitrary real frequencies. We restate Theorem 1.5 after fixing .
Theorem 6.1.
Given and , let be a -Fourier-sparse signal centered around : where and be the observable signal on , where for a sufficiently small constant .
For any , there exist and an efficient algorithm that takes samples from and outputs satisfying with probability at least .
The main improvement is a biased distribution that reduces the sample complexity from to .
We provide the main technical lemma here and defer the proofs of Theorems 1.3, 1.4, and 6.1 to Appendix 8.
Theorem 6.2.
Given with , there exists a degree polynomial satisfying
1. for each .
2. Coefficients , and .
Corollary 6.3.
Given any and , there exist and a sequence of coefficients such that
1. for any .
2. For any (not necessarily in ), .
Proof.
Given , we set and apply Theorem 6.2 to obtain coefficients . Then we set . It is straightforward to verify the second property because of
[TABLE]
∎
The proof of Theorem 6.2 requires the following bound on the coefficients of residual polynomials, which is stated as Lemma 5.3 in [5].
Lemma 6.4.
Given , for any integer , let denote the residual polynomial of . Then each coefficient in is bounded: for and for .
We finish the proof of Theorem 6.2 here.
Proof.
Let be a large constant and . We use to denote the following subset of polynomials with bounded coefficients:
[TABLE]
For each polynomial , we rewrite as
[TABLE]
The coefficient is bounded by
[TABLE]
Then we apply the pigeonhole principle to the polynomials in after reduction modulo : there exist polynomials such that each coefficient of is small from the counting
[TABLE]
Because , there exists and such that the lowest monomial with different coefficients in and satisfies . Eventually we set
[TABLE]
to satisfy the first property . We prove the second property in the rest of this proof.
We bound every coefficient in $\big(z^{-l}\bmod\prod_{j=1}^{k}(z-z_{j})\big)\cdot\big(P_{j_{1}}(z)-P_{j_{2}}(z)\bmod\prod_{j=1}^{k}(z-z_{j})\big)$ by
[TABLE]
which is less than from Lemma 6.4 and the above discussion.
On the other hand, the constant coefficient in $z^{-l}\cdot\big(P_{j_{1}}(z)-P_{j_{2}}(z)\big)$ is at least because is the smallest monomial with different coefficients in and from . Thus the constant coefficient of is at least .
Next we upper bound the sum of the rest of the coefficients by
[TABLE]
which demonstrates the second property after normalizing to 1. ∎
Acknowledgement
We thank Daniel Kane and Zhao Song for many helpful discussions. We also thank the anonymous referee for the detailed feedback and comments.
7 Properties of the Filter function
We show basic properties of our filter function in Appendix 7.1 and prove Theorem 4.2 in Appendix 7.2.
7.1 Properties of
We use two bounds on the function:
1. For any , .
2. For any , .
Without loss of generality, we assume both and are powers of 2 and (otherwise set ). Recall that is even in this section.
We use to denote the product of sinc functions in for convenience:
[TABLE]
We fix in this section and rewrite as
[TABLE]
Before we show the properties of , we consider the tail of .
Claim 7.1.
1. for .
2. for .
3. for .
4. For any , for any .
5. for .
Proof.
We first bound then bound $\prod_{j=0}^{l}\operatorname{sinc}\big(2^{-j}\cdot C\cdot S\cdot t\big)^{2^{j}\cdot C}$.
For , from the second property of functions,
[TABLE]
and
[TABLE] 2. 2.
For , from the first property of functions,
[TABLE]
Then we bound the tail of the product of sinc functions.
For ,
[TABLE]
Notice that is less than . Thus \operatorname{sinc}\big{(}2^{-j}\cdot C\cdot S\cdot t\big{)}^{2^{j}\cdot C}=\big{(}1-\Theta(2^{-j})\big{)}^{C} and their products over is
[TABLE] 2. 2.
Let us fix and consider \operatorname{sinc}\big{(}2^{-j}\cdot C\cdot S\cdot t\big{)}^{2^{j}\cdot C} for . By the first property of sinc function, for ,
[TABLE]
For , we use the same analysis with the second property of the sinc function:
[TABLE]
where is at least . Hence the product is
[TABLE]
We get the tail bounds by combining the above discussion of and $\prod_{j=0}^{l}\operatorname{sinc}\big(2^{-j}\cdot C\cdot S\cdot t\big)^{2^{j}\cdot C}$ together. ∎
Since , we have the following bounds on based on Claim 7.1.
Lemma 7.2.
For any constant ,
1. .
2. for .
3. for .
4. for of any .
5. $H(t)\leq s_{0}\cdot\big(\frac{1}{1.2\pi CR\cdot(|t|-\frac{1}{2})}\big)^{C\log R}\cdot\big(\frac{1}{C\pi\cdot(|t|-\frac{1}{2})}\big)^{CS}$ for .
Proof.
We bound the integration of different intervals of as follows:
. 2. 2.
. 3. 3.
For any of ,
[TABLE] 4. 4.
For ,
[TABLE]
Next we prove all claims in this lemma.
For , notice that
[TABLE]
which also indicates . 2. 2.
When , , which is in 3. 3.
When , . 4. 4.
When ,
[TABLE] 5. 5.
When of a positive integer ,
[TABLE] 6. 6.
When , we use the bound in the last item of the above discussion.
∎
7.2 Proof of Theorem 4.2
We finish the proof of Theorem 4.2 using Lemma 7.2 for . Without loss of generality, we assume in this proof (otherwise set ).
We first show
[TABLE]
From the second property of in Lemma 7.2, $H(\alpha x)\geq 1-0.01$ for any such that
[TABLE]
At the same time, for any . This indicates
[TABLE]
The first property follows from these two inequalities.
In the rest of this proof, we apply Lemma 7.2 to prove:
[TABLE]
We split $\int_{1}^{\infty}|g(x)\cdot H(\alpha x)|^{2}\mathrm{d}x$ into several intervals:
[TABLE]
In the first two terms, we rewrite as . By the third and fourth properties of in Lemma 7.2, their sum is less than . For the last term, given the last property of in Lemma 7.2 and a large constant , we have
[TABLE]
It is straightforward to verify that $\int_{1}^{\infty}|g(x)\cdot H(\alpha x)|^{2}\mathrm{d}x\leq 0.02\cdot\|g\|^{2}_{2}$.
The last property follows from the upper bounds in Lemma 7.2.
8 Omitted Proofs in Section 6
We first prove Theorem 1.5, then finish the proofs of Theorems 1.3 and 1.4 in Appendices 8.2 and 8.3, respectively.
8.1 Proof of Theorem 1.5
We finish the proof of Theorem 1.5 in this section. The only difference compared to Theorem 1.2 is the use of a biased distribution, which improves the sample complexity to .
How to Generate Samples.
We will use a non-uniform distribution on to generate the random samples. For samples , we assign a weight for each sample such that for any function ,
[TABLE]
[6] presented an explicit distribution such that samples could guarantee is close to with high probability. For completeness, we show it with our improved bound .
Lemma 8.1.
Given the sparsity , there exists a constant such that the distribution
[TABLE]
guarantees for any -Fourier-sparse signal , .
Moreover, samples from with weights for guarantee that, with probability at least ,
[TABLE]
Proof.
Given and the -Fourier-sparse signal , let denote for . We have $\mathbb{E}_{x\sim D}\big[z(x)\big]=\mathbb{E}_{x\sim[-1,1]}\big[|g(x)|^{2}\big]=\|g\|_{2}^{2}$ and . We apply the Chernoff bound in Lemma 2.1 on the random variables to obtain the statement. ∎
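The reweighting behind such a biased distribution can be sketched generically. In the code below the arcsine density $1/(\pi\sqrt{1-x^{2}})$ is a stand-in of our own choosing, not the distribution $D$ of Lemma 8.1; it merely shares the feature of placing more samples near the endpoints. Each sample receives weight $\frac{1/2}{m\cdot D(x_i)}$, so the weighted sum is an unbiased estimate of the uniform average $\mathbb{E}_{x\sim[-1,1]}|g(x)|^{2}$. The function name is hypothetical.

```python
import numpy as np

def weighted_energy_estimate(g, m, rng):
    """Estimate E_{x ~ Unif[-1,1]} |g(x)|^2 from arcsine-distributed samples.

    The arcsine density 1 / (pi * sqrt(1 - x^2)) is an illustrative stand-in
    for the biased distribution, not the one defined in the lemma.
    """
    x = np.cos(np.pi * rng.uniform(0.0, 1.0, size=m))    # arcsine-distributed on [-1, 1]
    # Weight = (uniform density 1/2) / (m * arcsine density at x).
    weights = 0.5 * np.pi * np.sqrt(1.0 - x ** 2) / m
    return float(np.sum(weights * np.abs(g(x)) ** 2))
```

For a signal whose energy sits near the endpoints, such as $g(x)=x^{8}$ with uniform average $1/17$, the weighted estimate concentrates with fewer samples than uniform sampling would need for the same accuracy.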
Similar to Lemma 5.2, we state the following version for Fourier-sparse signals.
Lemma 8.2.
Given the sparsity and , let be a -Fourier-sparse signal with and .
Let be the observable signal on where the noise for a sufficiently small constant . There exist a constant and an algorithm such that for any , it takes samples to output satisfying with probability at least 0.6.
We show our algorithm in Algorithm 2. We finish the proof of Theorem 1.5.
Proof of Theorem 6.1. From Lemma 8.2, gives a good estimation of with probability 0.6 for any . We use the frequency search algorithm of Lemma 7.3 in [5] with the sampling procedure in Lemma 8.2. Because the algorithm in [5] uses the sampling procedure times to return a frequency satisfying with probability at least , the sample complexity is . ∎
8.2 Proof of Theorem 1.3
We bound of -Fourier-sparse signals in this section. We first state the technical result to prove the upper bound .
Theorem 8.3.
Given any , there exists such that for any , any , and any ,
[TABLE]
Proof of Theorem 8.3. Given frequencies and , we set . Let be the coefficients of the degree polynomial in Theorem 6.2. We have
[TABLE]
Hence for every ,
[TABLE]
By the Cauchy-Schwarz inequality, we have
[TABLE]
From the second property of in Theorem 6.2, $|g(t)|^{2}\leq O(k)\cdot\Big(\sum_{j=1}^{d}|g(t+j\cdot\theta)|^{2}\Big)$. ∎
We finish the proof of Theorem 1.3 bounding by the above relation. For convenience, we restate it for .
Theorem 8.4.
For any ,
[TABLE]
Proof.
We prove
[TABLE]
which indicates $|g(t)|^{2}=O(k^{3}\log^{2}k)\cdot\mathbb{E}_{x\sim[-1,1]}\big[|g(x)|^{2}\big]$. By symmetry, it also implies that $|g(t)|^{2}=O(k^{3}\log^{2}k)\cdot\mathbb{E}_{x\sim[-1,1]}\big[|g(x)|^{2}\big]$ for any .
We use Theorem 8.3 on :
[TABLE]
From all discussion above, we have . ∎
8.3 Growth outside of the observation
We prove Theorem 1.4 which bounds in this section. We divide the proof into two parts for and separately after fixing .
Lemma 8.5.
For any , there exists a constant such that for any , .
Remark 8.6.
The growth of Chebyshev polynomial is for .
Proof.
Let denote the length of the linear combination in Corollary 6.3 and . Given and , we use to denote the coefficients of the linear combination of in Corollary 6.3. For convenience, we use to denote the upper bound on the coefficients .
We use induction to prove that for some , for any ,
[TABLE]
For the base case , from Corollary 6.3, where . Because each $\big|g(x-j\theta)\big|\leq C\cdot k^{1.5}\log k\cdot\|g\|_{2}$ from Theorem 1.3, we have
[TABLE]
Suppose (7) is true for any . Let us consider . We still have where each . This indicates
[TABLE]
∎
For completeness, we bound the growth rate of here, which is a reformulation of Lemma 5.5 in [5].
Lemma 8.7.
For any and any ,
[TABLE]
Proof.
We fix in this proof. Let and $n=\big[(t+1/2)/\theta\big]$ such that and . We first show the coefficients in
[TABLE]
satisfy $g(t)=\sum_{l=0}^{k-1}C_{l}\cdot g\big(t-(n-l)\theta\big)$. Let such that . For , we rewrite it as
[TABLE]
Thus .
Since , from [6]. On the other hand, from Lemma 6.4.
From all discussion above,
[TABLE]
∎
Proof of Theorem 1.4. We combine Lemma 8.5 and 8.7: For , . For , is still less than . ∎
References
- [1] A. Akavia, S. Goldwasser, and S. Safra. Proving hard-core predicates using list decoding. FOCS, 44:146–159, 2003.
- [2] Haim Avron, Michael Kapralov, Cameron Musco, Christopher Musco, Ameya Velingker, and Amir Zandieh. A universal sampling method for reconstructing signals with simple Fourier transforms. In Proceedings of the 51st Annual ACM Symposium on Theory of Computing (STOC 2019), 2019.
- [3] Y. Bresler and A. Macovski. Exact maximum likelihood parameter estimation of superimposed exponential signals in noise. IEEE Transactions on Acoustics, Speech, and Signal Processing, 34(5):1081–1089, Oct 1986.
- [4] Herman Chernoff. A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations. The Annals of Mathematical Statistics, 23:493–507, 1952.
- [5] Xue Chen, Daniel M. Kane, Eric Price, and Zhao Song. Fourier-sparse interpolation without a frequency gap. In Foundations of Computer Science (FOCS), 2016 IEEE 57th Annual Symposium on, 2016.
- [6] Xue Chen and Eric Price. Active regression via linear-sample sparsification. arXiv preprint arXiv:1711.10051, 2018.
- [7] Anna C Gilbert, Sudipto Guha, Piotr Indyk, S Muthukrishnan, and Martin Strauss. Near-optimal sparse Fourier representations via sampling. In Proceedings of the Thirty-Fourth Annual ACM Symposium on Theory of Computing, pages 152–161. ACM, 2002.
- [8] Anna C Gilbert, S Muthukrishnan, and Martin Strauss. Improved time bounds for near-optimal sparse Fourier representations. In Optics & Photonics 2005, pages 59141A–59141A. International Society for Optics and Photonics, 2005.
