Multi-target detection with application to cryo-electron microscopy
Tamir Bendory, Nicolas Boumal, William Leeb, Eitan Levin, Amit Singer

TL;DR
This paper introduces an autocorrelation-based method for multi-target signal estimation in noisy measurements, enabling recovery even at high noise levels where traditional detection fails, with applications to cryo-electron microscopy.
Contribution
It develops a novel autocorrelation analysis approach that allows signal estimation from noisy data without detection or clustering, extending capabilities to extreme noise conditions.
Findings
Method accurately estimates signals at high noise levels.
Autocorrelation relations enable signal recovery from long measurements.
Numerical experiments demonstrate effectiveness across various scenarios.
Abstract
We consider the multi-target detection problem of recovering a set of signals that appear multiple times at unknown locations in a noisy measurement. In the low noise regime, one can estimate the signals by first detecting occurrences, then clustering and averaging them. In the high noise regime however, neither detection nor clustering can be performed reliably, so that strategies along these lines are destined to fail. Notwithstanding, using autocorrelation analysis, we show that the impossibility to detect and cluster signal occurrences in the presence of high noise does not necessarily preclude signal estimation. Specifically, to estimate the signals, we derive simple relations between the autocorrelations of the observation and those of the signals. These autocorrelations can be estimated accurately at any noise level given a sufficiently long measurement. To recover the signals…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Multi-target detection with application to cryo-electron microscopy
Tamir Bendory
The Program in Applied and Computational Mathematics, Princeton University, Princeton, NJ, USA
Nicolas Boumal
Department of Mathematics, Princeton University, Princeton, NJ, USA
William Leeb
School of Mathematics, University of Minnesota, Minneapolis, MN, USA
Eitan Levin
The Program in Applied and Computational Mathematics, Princeton University, Princeton, NJ, USA
Department of Mathematics, Princeton University, Princeton, NJ, USA
Amit Singer
The Program in Applied and Computational Mathematics, Princeton University, Princeton, NJ, USA
Department of Mathematics, Princeton University, Princeton, NJ, USA
Abstract
We consider the multi-target detection problem of recovering a set of signals that appear multiple times at unknown locations in a noisy measurement. In the low noise regime, one can estimate the signals by first detecting occurrences, then clustering and averaging them. In the high noise regime, however, neither detection nor clustering can be performed reliably, so that strategies along these lines are destined to fail. Notwithstanding, using autocorrelation analysis, we show that the impossibility to detect and cluster signal occurrences in the presence of high noise does not necessarily preclude signal estimation. Specifically, to estimate the signals, we derive simple relations between the autocorrelations of the observation and those of the signals. These autocorrelations can be estimated accurately at any noise level given a sufficiently long measurement. To recover the signals from the observed autocorrelations, we solve a set of polynomial equations through nonlinear least-squares. We provide analysis regarding well-posedness of the task, and demonstrate numerically the effectiveness of the method in a variety of settings.
The main goal of this work is to provide theoretical and numerical support for a recently proposed framework to image 3-D structures of biological macromolecules using cryo-electron microscopy in extreme noise levels.
1 Introduction
We consider the multi-target detection problem of recovering a set of signals that appear multiple times at unknown locations in a noisy measurement. Let be the sought signals and let be the observed data, where we assume is much larger than . Let count the number of signal occurrences whose first entry is positioned at . Each of those signals is chosen among according to some (possibly unknown) distribution over . If signal occurrences overlap, they interfere additively. With additive white Gaussian noise, the measurement model can be written as
[TABLE]
where denotes linear convolution, and indicates the number of occurrences of starting at , so that . Explicitly, with zero-based indexing,
[TABLE]
The goal is to estimate from . In parts of the paper, we focus on the case , called the homogeneous case; the case is called heterogeneous. This idealized setup appears in several scientific applications, including structural biology [13] (as we detail below), spike sorting [32], passive radar [20], and system identification [37].
In the low noise regime (small ), a valid strategy is to first detect the signal occurrences in (that is, estimate ), cluster them (that is, separate into ), and solve standard deconvolution problems. Crucially, we focus on the high noise regime, where reliable detection of signal occurrences is impossible [13, 3]. This limitation does not, however, preclude estimation of the signals , as we show in this paper. In this setting, we refer to as nuisance variables: knowing them would certainly help, but we do not aim to estimate them.
In order to recover the signals in the high noise regime, we use autocorrelation analysis. At any noise level, the autocorrelations of the observation can be estimated to any desired accuracy for sufficiently large . This computation is straightforward and requires only one pass over the data. The underlying principle is to relate the autocorrelations of the observation to the autocorrelations of .
Below we describe two generative models for . In these models, the relationship between the autocorrelations of and those of depends on only through their expected sums, that is, the expected total number of occurrences of each signal. To estimate the signals and occurrence counts from the computed autocorrelations, we solve a nonlinear least-squares problem as explained in Section 5.
The multi-target detection problem is an instance of blind deconvolution—a longstanding problem arising in a variety of engineering and scientific applications, such as astronomy, communication, image deblurring, system identification and optics; see [24, 41, 5, 2], to name a few. Different variants of the blind deconvolution problem have been thoroughly analyzed recently [4, 34, 33, 30, 35, 27]. In clear contrast to multi-target detection, these works focus on the low noise regime and aim to estimate both unknown signals (in our setting, this means estimating both ’s and ’s).
Models for target distribution
We consider two models for the distribution of signal occurrences in the observation, that is, for .
The well-separated model.
As a first setup, we allow any generative model for which meets the following separation requirement: is binary, and
[TABLE]
In words: the starting positions of any two occurrences must be separated by at least positions, so that their end points are necessarily separated by at least signal-free (but still noisy) entries in the data. Furthermore, we require that the last signal occurrence in is also followed by at least signal-free entries. This property ensures that correlating with versions of itself shifted by at most entries does not involve correlating distinct signal occurrences. Once is determined, for each position such that , one of the signals is selected independently at random, and accordingly we set . As a result, the only properties of that affect the autocorrelations of (for shifts up to ) are the total number of occurrences of the distinct signals: their individual and relative locations do not intervene. We detail this in Section 3.
The Poisson model.
If the separation condition is violated, more knowledge about the location distribution is necessary to disentangle the autocorrelations of . To that effect, we analyze a Poisson generative model.
Specifically, for each position , the number of signal occurrences starting at that position is drawn independently from a Poisson distribution with parameter , that is, for some parameter . Then, is split into by selecting signals among , independently at random following a fixed distribution over . It is possible for more than one occurrence of the same to start at position . As for the well-separated model, the autocorrelations of under this model depend weakly on and , essentially through the parameter : see Section 3.
Extensions
Extending the problem setup and autocorrelation analysis to signals in more than one dimension is straightforward: see the discussion in Section 4.4 and numerical experiments in Section 5.
Likewise, it is easy to extend the model to situations where the signal occurrences are sampled from a general distribution rather than from a finite set of choices . The finite setup corresponds to the following distribution for signal occurrences:
[TABLE]
where is a Dirac delta (a point mass) located at , and encodes the discrete distribution over . In the generalized setup, the goal is to estimate the distribution (possibly defined by a finite set of parameters). In particular, this allows for continuous distributions of targets. We adopt this perspective when deriving the autocorrelations in Section 3.
In the next section, we show how this flexibility allows us to model an important imaging problem in structural biology.
2 Connection with single-particle reconstruction via cryo-electron microscopy
Cryo-electron microscopy (cryo-EM) has recently joined X-ray crystallography and nuclear magnetic resonance (NMR) spectroscopy as a high-resolution structural method for biological macromolecules [18, 26, 10]. In a cryo–EM experiment, biological samples (e.g., macromolecules, viruses) are rapidly frozen in a thin layer of vitreous ice. The microscope produces 2-D tomographic images of the samples embedded in the ice, called micrographs. Each micrograph contains multiple tomographic projections of the samples at unknown locations and under unknown viewing directions. Importantly, the electron dose must be kept low to mitigate radiation damage, inducing high noise levels. The goal is to reconstruct 3-D models of the molecular structures from the micrographs. Since cryo-EM produces images of individual particles, it can elucidate multiple structures simultaneously. This is in clear contrast to X-ray and NMR, which aggregate information from an ensembles of particles.
Considering the extensions described in the previous section, we can phrase a simplified generative model for micrographs (the observation ) within our framework. Specifically, locations are chosen in the 2-D plane of the image, corresponding to : this is where the molecules are fixed in the plane of the ice layer. At each selected location, a 2-D tomographic projection of a molecule (a signal occurrence) is added in the observation . This signal is drawn from a probability distribution described by a discrete number of parameters which correspond to the sought 3-D structure, as follows.
The 3-D structure (the target parameter) can be expanded into a linear combination of basis functions (for example, spherical harmonics for the spherical part and Bessel functions for the radial part): the coefficients of this expansion are the unknowns. Then, a rotation is applied to the volume (to model viewing directions) according to a (possibly unknown) distribution of over (the group of 3-D rotations). With tomographic projection denoted by , is a random signal, related to the distribution of through:
[TABLE]
By further allowing itself to be a random signal as well, this model can also encode a mixture of structures in the biological sample. This mixture might also be continuous, corresponding to continuous conformational variability.
All contemporary methods for single particle reconstruction using cryo-EM split the reconstruction procedure into two main stages. The first stage, called particle picking, detects and extracts the particle projections from the micrographs. Given the projections, the second stage aims to reconstruct a 3-D model of the molecular structure, usually using an expectation-maximization algorithm [40]. Crucially, reliable detection of individual particles is impossible in a highly noisy environment. This fact has been recognized early on by the cryo-EM community. Particularly, in [22, 19], it was reasoned that particle picking is impossible for molecules below a certain weight (below 50 kDa).
Even if particle picking is feasible, procedures may be affected by model bias. Particle picking algorithms are typically based on correlating the micrograph with several templates. Areas highly correlated with some of the templates are assumed to contain projections: these yield the picked particles. Clearly, the picked particles depend heavily on the chosen templates: different templates may lead to different picked particles, hence, which is more problematic, to different 3-D structure reconstructions. The cryo-EM community is well aware of this potential pitfall [45, 23, 44], which was notably exemplified by the “Einstein from noise” experiment [42].
A recent work of the authors suggests a methodology to bypass particle picking and reconstruct the 3-D structure directly from the micrograph [13]. Based on autocorrelation analysis, it was shown that—at least in principle—the limits high noise regimes impose on particle picking do not necessarily translate into limits on 3-D reconstruction. The main goal of the present paper is to provide theoretical and numerical support for this approach, in a simplified setting.
In the next section, we introduce autocorrelation analysis in detail, focusing on the multi-target detection model. We mention that similar ideas in cryo-EM can be traced back to a seminal paper of Zvi Kam [25]. Kam proposed autocorrelation analysis for 3-D reconstruction, under the assumption of picked, perfectly centered, particles. Kam’s method has been extended and used in X-ray free electron lasers (XFEL) and cryo-EM [36, 28, 31, 46]. In order to investigate the computational and statistical properties of Kam’s method, a series of papers have studied a simplified model, called multi-reference alignment [8, 14, 6, 39, 7, 1]. We follow the same line of research by considering the multi-target detection as an abstraction to the application of reconstructing 3-D structures directly from the micrograph [13].
3 Autocorrelation analysis
In what follows, we consider autocorrelations of both the observation and of the signal occurrences in . As per our discussion of extensions, the signal occurrences may be sampled from a discrete set (as in (1.3)), or from a more general distribution. Accordingly, we define autocorrelations broadly for a random signal of length . For our purposes, this will be applied both to signal occurrences (of length ) and to (of length ).
For a random signal , the autocorrelation of order is given for any integer shifts by
[TABLE]
where the expectation is taken with respect to the distribution of . Indexing out of bounds is zero-padded, that is, for out of the range . Explicitly, the first-, second- and third-order autocorrelations are given by:
[TABLE]
Since autocorrelations depend only on the differences between indices, they obey the following symmetries:
[TABLE]
and
[TABLE]
In particular, for sampled from with probabilities as in (1.3), the autocorrelations of are given in explicit form in terms of those of the deterministic signals as:
[TABLE]
Explicit expressions of the autocorrelations for the more involved model of cryo-EM (2.1) are given in [13].
We are given one observation (one realization) of . Thus, we cannot compute the autocorrelations of exactly as they involve taking an expectation against the distribution of . However, by the law of large numbers, as grows to infinity the empirical autocorrelations of almost surely (a.s.) converge to the actual (population) autocorrelations of , that is,
[TABLE]
This provides a concrete means of estimating the quantities . In the remainder of this section, we relate the observables to the unknowns , first under the well-separated model, then under the Poisson model.
3.1 Autocorrelations under the well-separated model
Under the separation condition (1.2), the relation between autocorrelations of the observation and those of is particularly simple, as we now show. It is useful to introduce some notation: let denote the number of signal occurrences in , and let
[TABLE]
This is the fraction of entries of occupied by signal occurrences. The separation condition imposes . For the heterogeneous model (1.1), one can define
[TABLE]
where so that .
Owing to the separation condition, when correlating with shifted versions of itself for shifts in , any given occurrence of in is only ever correlated with itself, and never with another occurrence. As a result, the autocorrelations of depend on the corresponding autocorrelations of , the noise level and the density (which is a weak dependence on the support signal ). Specifically, we show the following identities in Appendix A:
[TABLE]
where and , and indices are in the range . Terms proportional to are due to noise. If is known, they can be handled easily. If is unknown, one can either estimate it form the data, or one can ignore the few entries of the autocorrelations that are affected by —one in and in , a relatively small number in both cases.
We show in Section 4.1 that and can be identified uniquely from the observed autocorrelations for the homogeneous case (all signal occurrences are the same).
3.2 Autocorrelations under the Poisson model
In this section, we give the expressions for the autocorrelations of the observed signal under the Poisson model. We first note that the expected total number of signals occurring in is equal to , and consequently the total lengths of all signals (including overlaps) divided by is equal to
[TABLE]
Therefore, in the large limit, similarly to the role of in the well-separated model (3.5), the parameter may be interpreted as the signal density.
In Appendix B, we prove the following expressions for the autocorrelations of under the Poisson model. As in the well-separated model, the observed autocorrelations do not depend on individual occurrences of the signal, but only on the distribution of itself, and the parameters and :
[TABLE]
Note that from those autocorrelations, it is easy to retrieve the autocorrelations of the well-separated model for all entries unaffected by . If is known, then it is true for all entries.
In the homogeneous case (), we show in Section 4.2 that and identify uniquely the signal , the Poisson parameter , and the noise level for generic .
4 Theory
We begin this section by showing that, in the homogeneous case (), under both the well-separated and the Poisson models, the first three observed autocorrelations identify the (deterministic) signal , the density parameter , and the noise level . Then, for the heterogeneous case (), we bound from above the number of signals that can be recovered from those autocorrelations as a function of the signal length . Finally, we briefly discuss multi-dimensional signals.
Before that, we start by showing that in the homogeneous case a deterministic signal is identified uniquely by its second and the third autocorrelations. Indeed, assuming and are nonzero, we can recover explicitly by
[TABLE]
for . If or are equal to zero, then and we can use that indication to shrink . This proves the following useful fact:
Proposition 4.1**.**
A deterministic signal is determined uniquely by and .
Note that the procedure described in (4.1) is not numerically stable: if or are close to 0, recovery of is sensitive to errors in the autocorrelations. In practice, we recover by fitting it to its autocorrelations using a nonconvex least-squares procedure, which is empirically robust to additive noise. In prior work, we have observed similar phenomena for the related problem of multi-reference alignment [14, 16, 1].
4.1 Guarantees for the homogeneous well-separated model
The observed moments and under the well-separated model do not immediately yield the autocorrelations of the signal ; rather, the two are related by the noise level and the signal density . We will show, however, that all parameters— and —are still identified uniquely by the observed moments of .
First, we observe that if the noise level is known, generally, one can estimate from the first two moments of the micrograph. The other direction is true as well: if is known, then one can estimate . The proof is provided in Appendix C.
Proposition 4.2**.**
Assume the separation condition (1.2) holds and (all signal occurrences in are identical, up to noise). If the mean of is nonzero, then
[TABLE]
meaning can be determined from (and vice versa) using the observables .
Using third-order autocorrelation information of , both the ratio and the noise can be determined simultaneously. For the following results, when we say that a result holds for a “generic” signal , we mean that the set of signals which cannot be determined by these measurements has Lebesgue measure zero. In particular, this means that we can recover almost all signals with the given measurements. The proof is provided in Appendix D.
Proposition 4.3**.**
Assume , and assume that the separation condition (1.2) holds. Then, the observed autocorrelations and determine the ratio and noise level uniquely for a generic signal . If , then this holds for any signal with nonzero mean.
From Propositions 4.1 and 4.3 we deduce the following:
Corollary 4.4**.**
Assume and . Under the separation condition (1.2), the signal , the ratio , and the noise level are determined from the first three autocorrelation functions of if either the signal is generic, or has nonzero mean and .
4.2 Guarantees for the homogeneous Poisson model
Similarly to the homogeneous well-separated model, the observed autocorrelations under the Poisson model identify and uniquely. Opposed to Proposition 4.3, these quantities can be computed explicitly. In Appendix E we prove the following result:
Proposition 4.5**.**
Under the homogeneous Poisson model, the signal , the noise level and the Poisson parameter are identified uniquely from the observed autocorrelations. In particular,
[TABLE]
where is indeed observable since
[TABLE]
4.3 Elementary limitations of the heterogeneous case
In the heterogeneous model (1.1), the unknowns are signals of length , together with their densities (3.6) (equivalently: the distribution and overall density ) and possibly the noise level . To estimate these parameters, we must collect at least as many independent equations. Within our framework, polynomial equations are provided by the observable autocorrelations, which correspond to mixed autocorrelations of the unknowns as per (3.3). In this section, following [16], we count how many equations the first three autocorrelations may provide in the best case (discounting symmetries). This leads to a straightforward information-theoretic upper bound on the number of signals which can be estimated, as a function of . This is only an upper bound, though a bound of the same type was shown to be tight in a similar setting [7]. The counting is based on the autocorrelations of the well-separated model. For entries independent of , the autocorrelations of the Poisson process contain the same information as those of the well-separated model. If is known, this holds true for all entries; see Section 3.2.
The first-order autocorrelation (3.7) provides one equation. For second-order autocorrelations (3.8), if is known we obtain equations with ranging from 0 to . If is unknown, we may disregard (the only entry affected by ) and still collect equations. Similarly, for third-order autocorrelations, (3.9) with such that includes all relevant entries for our purpose (this accounts for symmetries), providing equations in total. If we further exclude any entries such that or are zero to avoid the need to estimate , there are remaining entries.
Hence, if is known we collect
[TABLE]
equations, while if it is unknown and we choose not to estimate it, then we collect
[TABLE]
equations in total. Of course, there may be redundancy in these equations: we aim only to provide an upper bound.
Since we aim to estimate parameters for the signals of length , plus parameters for the densities , there are unknowns. As a result, an absolute upper bound on such that the estimation problem may be solvable is
[TABLE]
for the case of known, and
[TABLE]
for the case of unknown and not estimated. Overall, this indicates that, at best, approximately signals and their densities can be recovered from the first three mixed autocorrelations. Based on related results in [7], we expect that as many as signals can indeed be estimated, though possibly not with computationally tractable estimators.
4.4 Autocorrelations in higher dimensions
Autocorrelations in dimensions are defined for as
[TABLE]
Interestingly, for the homogeneous case () in dimensions greater than one, almost all signals are determined uniquely from their second-order autocorrelation, up to two symmetries: sign (or phase for complex signals) and reflection through the origin (with conjugation in the complex case) [21]. If the mean of the signal is available and non-zero, the sign symmetry can be resolved. However, determining the reflection symmetry still requires additional information, beyond the second-order autocorrelation. The case of 1-D signals is fundamentally different: generally there are signals with the same second-order autocorrelation (after eliminating symmetries) [11, 12].
This uniqueness result for two and three-dimensional signals is the basis of a popular imaging technique called coherent diffraction imaging (CDI). In CDI, an object is illuminated with a coherent wave, and the far field diffraction pattern is measured, corresponding to the object’s Fourier magnitude [38, 43]. If the diffraction pattern is over-sampled by at least twice the Nyquist frequency, the data is equivalent to the signal’s second-order autocorrelation. In this context, the computational problem of recovering the signal from its second-order autocorrelation is usually referred to as phase retrieval or the phase problem. However, for 2-D images it has been shown recently that the problem is ill-conditioned unless the support of the image is known exactly [9]. That is, there exist other images whose second-order autocorrelations agree up to machine precision.
5 Algorithms and numerical experiments
In this section, we present three numerical experiments. In the first two experiments, we consider the heterogeneous model (1.1). First, with a fixed noise level, we show how the estimation quality improves as the length of the observation grows. Second, we explore how many signals can be recovered as a function of their length, in an infinite data regime where effects of the noise have been averaged out. In the last experiment, we extend our model to 2-D signals. We run the experiments on a shared computer with 144 logical CPUs of type Intel(R) Xeon(R) CPU E7-8880 v3 2.30GHz and 755Gb of RAM; we use at most 72 of these CPUs. The code for all experiments is available at https://github.com/PrincetonUniversity/BreakingDetectionLimit.
In this section, to assess the quality of our reconstruction, we compare against the ground truth. In a realistic setting, the ground truth is of course not available, which raises the question of how one can validate the results. A common technique used in cryo-EM is to split the data in two halves, produce two independent reconstructions based on these halves, then to compare the two reconstructions. It is clear how the same technique can be applied to our setting.
5.1 Experiment 1
For the experiment depicted in Figure 1, we fix signals of length : see the three red signals in the first column. The first signal’s actual support has length 11 (rather than 21), which allows us to simulate the situation in which the support of the signal is overestimated.
Following (1.1), we generate an observation of length . Each of the three signals appears, respectively (and approximately), , and times in for a total of exactly occurrences, such that at least zeros separate any two occurrences of any signals according to (1.2). This is done by randomly selecting placements in , one at a time with an accept/reject rule based on the separation constraint (1.2) and locations picked so far. For each placement, one of the three signals is picked at random according to the proportions . Then, i.i.d. Gaussian noise with mean zero and standard deviation is added, to form the observed .
Visually, the noise dominates the signal to the point that it is challenging to detect occurrences. More precisely, the cross-correlations of even with the true signals presents peaks at essentially random locations, uninformative of the actual locations of the signal occurrences. Thus, we contend that it would be difficult for any algorithm to locate the signal occurrences, let alone to cluster them according to which signal appears where.
Our aim is to investigate how accurately we can estimate the signals as a function of the observation length. To this end, we consider a growing part of the observation . For each length, we compute autocorrelations on that part, then we go on to estimate the signals from these “mixed” quantities. In practice, the autocorrelations are computed on disjoint segments of of length and added up, without correction for the junction points. Segments are handled sequentially on a GPU, as GPUs are particularly well suited to execute simple instructions across large vectors of data. If multiple GPUs are available, segments can be handled in parallel.
Having computed the autocorrelations of interest, we estimate signals and coefficients which agree with the data. We choose to do so by running an optimization algorithm on the following nonlinear least-squares problem:
[TABLE]
where is the length of the sought signals and the weights are set to , where are the number of coefficients used for each autocorrelation order: , (weights could also be set in accordance with variance estimates as in [16]).
Setting (as is a priori desired) is problematic because the above optimization problem appears to have numerous poor local optimizers. Thus, we first run the optimization with . This problem appears to have few poor local optima, perhaps because the additional degrees of freedom allow for more escape directions. Since we hope the signals estimated this way correspond to the true signals zero-padded on either side to length , we extract from each one a subsignal of length that has largest -norm. This estimator is then used as initial iterate for (5.1), this time with . We find that this procedure is reliable for a wide range of experimental parameters. To solve (5.1), we run the trust-region method implemented in Manopt [15] from 10 different random initial guesses and keep the one with lowest cost function value. Manopt allows to treat the positivity constraints on coefficients . Notice that the cost function is a polynomial in the variables, so that it is straightforward to compute it and its derivatives.
To asses reconstruction quality, we report the relative root mean squared error between the estimated signals and the ground truth (up to permutation of the signals.) For the first signal (with the overestimated support), we also translationally aligned the estimate with the ground truth because its estimation is only possible up to shift.
In Figure 1, we find that the signals can be recovered with good accuracy despite the noise levels which seemingly hinders location and clustering. We also note that the amount of data required to produce these good estimations is large. Furthermore, as illustrated here and as we have observed in numerous experiments, signals with more variations (such as the third signal in this experiment which was generated once from a Gaussian distribution) are easier to estimate accurately than more regular signals (in this case, despite the fact that the third signal occurs less frequently than the others). This phenomenon has been also observed in multi-reference alignment [39, Section 3.2].
5.2 Experiment 2
In this second experiment, presented in Figure 2, we investigate how many distinct signals can be estimated from mixed autocorrelations (3.3). In order to do so, we consider a setup where the mixed autocorrelations are known perfectly. This corresponds to the limit of an infinitely long observation with fixed density and fixed noise level (that may be arbitrarily high). The specific value of is immaterial since we only consider autocorrelations that are unaffected by noise bias. Furthermore, we assume uniform occurrence distribution (known to the algorithm), and known density (which is then irrelevant as it only induces a global scaling of the autocorrelations of ).
To produce Figure 2, we consider each pair in turn, with and . For each pair, we generate random normal signals of length , once. The perfect mixed autocorrelations are computed. They are then provided to the inversion algorithm described in Section 5.1, together with the knowledge that signals occur with equal probability, as well as the correct density . The algorithm is initialized 50 times with an independent random initial guess, also following a normal distribution. For each run, we record three metrics:
Whether the optimization algorithm managed to produce a solution with cost function value below : this assesses whether optimization succeeded. 2. 2.
The relative root mean squared error between the estimated signals and the ground truth (up to permutation of the signals.) 3. 3.
The computation time in seconds (keeping in mind that the 50 runs are done in parallel on the same, shared computer, so that this is more of a qualitative assessment.)
These metrics are summarized and presented in Figure 2 as three panels.
Panel 1 shows for each pair which fraction of the 50 runs reached optimality (between 0 and 1). 2. 2.
For each pair , any estimator produced by the optimization algorithm such that the cost function value is close to zero must be considered a valid estimator, since it agrees almost perfectly with the data. For all of those, we compute the error compared to the ground truth. Panel 2 shows the largest such error, on a log scale in base 10. (If optimization never succeeded for that pair, we report a relative error of 1.) A large value means that, among all near global optima of the optimization problem (if any), at least one was a poor estimator. A small value indicates all computed near global optima were good estimators. 3. 3.
Median computation time (where the median is computed after taking the log of the CPU times, in base 10.)
Following [16], overlaid on the panels we trace the red curve as well as red dots which are computed from the considerations in Section 4.3 (adapted to the fact that the ’s are known). We observe that strictly above the red dots the optimization problem appears to be easy to solve (despite non-convexity), yet, as predicted, the corresponding estimators are not informative since there is not enough information in the computed quantities compared to the number of parameters. On the other hand, below the (empirical) red curve, the optimization problem is sometimes solved to optimality (although it may take more than one random initialization to get one successful run), and the corresponding estimators are accurate. In between the red curve and the red dots, the optimization problem appears to be particularly challenging: we essentially never produce a global optimum, hence we also do not have an estimator. This experiment suggests a possible computational-statistical gap in the area between the red curve and the red dots, where it is possible that the signals could be estimated, but perhaps not with a computationally efficient procedure. Similar results were observed for the multi-reference problem [16, 47, 7].
5.3 Experiment 3
Autocorrelation analysis can be carried out in dimensions greater than one. In the following experiment, we estimate a 50-by-50 pixel grayscale picture of Einstein with mean zero from a growing number of observations . Each observation is of size pixels and contains 700 occurrences on average at random locations, while maintaining the separation condition (1.2) in each axis separately. The observations are contaminated with additive white Gaussian noise with standard deviation , illustrated in Figure 3.
We compute the average second-order autocorrelation of the observations. This is a particularly simple computation which can be efficiently executed with a fast Fourier transform (FFT), in parallel over the numerous observations. We assume the number of signal occurrences (akin to the density ) and the standard deviation of the noise, , are known. Given those quantities, the second-order autocorrelation of the image can be easily deduced from (3.8). As explained in Section 4.4, an image is determined uniquely form its second-order autocorrelations. Then, to estimate the target image, we use a standard phase retrieval algorithm called relaxed-reflect-reflect (RRR) [17], initialized randomly.
Relative error is measured as the ratio of the root mean square error to the norm of the ground truth (square root of the sum of squared pixel intensities). Figure 4 shows several estimated images for a growing number of observations. Figure 5 presents the normalized recovery error as a function of the amount of data available. This is computed after fixing the reflection symmetries (see Section 4.4).
As evidenced by these figures, the ground truth image can be estimated increasingly well from increasingly many observations, without the need to locate the signal occurrences.
6 Summary
This paper suggests a computational framework for estimation under extreme noise levels. The crux of the method lies in the distinction between parameters of interest (the signals) and nuisance variables (parameters associated with individual signal occurrences, such as location and class). In part through theory and in part through numerical experiments, we show that estimating the signals is possible even when they cannot be detected in the data. The method consists of two steps. First, we estimate the autocorrelations of the observation. A key feature is that, for any noise level, these autocorrelations can be estimated to any desired accuracy given sufficiently rich observations. Second, we recover the signals from the autocorrelations. This recovery entails solving a system of low-order polynomial equations. While solving such systems is hard in general, we found that in the homogeneous case we can solve them explicitly, and in both the homogeneous and heterogeneous cases we can solve them with reasonable robustness through non-convex optimization, in a wide regime of parameters.
In addition, autocorrelation analysis provides a flexible framework to extend the multi-target detection model by relating the expected autocorrelations of the data with the signals, and all parameters necessary to describe the generative model. For instance, a follow-up paper relaxes the separation condition (1.2) and allows an arbitrary spacing of targets, as long as the signal occurrences do not overlap [29]. In that case, the autocorrelations of the data are functions of the signal and the unknown target distribution. In a similar fashion, one may include more realistic noise models, the effect of a blurring kernel (e.g., the point spread function of the microscope) and so on.
The prime motivation of this paper emanates from challenges in small particle reconstruction using cryo-EM. Small particles induce such low signal-to-noise ratio in the micrograph that particle picking—the first step in any current cryo-EM reconstruction algorithm—is impossible. The main message of our recent report [13] is that particle picking is merely a means to an end (although admittedly of key usefulness when it can be done): the locations and classes of individual particle projections are nuisance variables. The ultimate goal is only to estimate the 3-D structures. To this end, we used autocorrelation analysis to estimate the structure directly from the micrograph, without particle picking. In order to gain better understanding of the method, this paper focuses on an abstraction of cryo-EM—the multi-target detection model. Our next goal is to reconsider the full cryo-EM model, both from theoretical and algorithmic perspectives. In particular, the numerical results in [13] suggest that the achieved resolution using autocorrelations up to third order is limited by ill-conditioning of the system of polynomial equations. Higher resolution may require computing higher-order autocorrelations, which would increase the sample complexity and computational complexity of the algorithm. Despite the challenges, we believe that this approach may ultimately offer a way to reconstruct 3-D structures that are too small for current algorithmic pipelines.
Acknowledgment
The authors thank Ayelet Heimowitz, Joe Kileel, Ti-Yen Lan and Amit Moscovich for helpful discussions. The research is supported in parts by Award Number R01GM090200 from the NIGMS, FA9550-17-1-0291 from AFOSR, Simons Foundation Math+X Investigator Award, the Moore Foundation Data-Driven Discovery Investigator Award, and NSF BIGDATA Award IIS-1837992. NB is partially supported by NSF award DMS-1719558.
Appendix A Autocorrelations in the well-separated model
Let denote the (independent) realizations of the random signal in the observation , starting at (deterministic) positions . Let be the indicator variable for whether position is in the support of occurrence , that is, it is one if is in , and zero otherwise. Then,
[TABLE]
This gives a simple expression for the first autocorrelation of . Indeed,
[TABLE]
Now switch the sums over and , and observe that is zero unless for in the range . Hence,
[TABLE]
Since the noise has zero mean and are identically distributed, we further find:
[TABLE]
To address the second-order moments, we resort to the separation condition (1.2). First, consider this expression:
[TABLE]
The cross-terms vanish in expectation since is zero mean and independent from the signal occurrences. The last term vanishes in expectation unless since distinct entries of are independent. For , . Finally, using the separation property, observe that if is nonzero, then it is equal to one, and for some in . Then, switch the order of summations to get
[TABLE]
where and . Since each is distributed as , they all have the same autocorrelations as and we finally get
[TABLE]
We now turn to the third-order autocorrelations. These involve the sum
[TABLE]
Using (A.1), we find that this quantity can be expressed as a sum of eight terms:
2. 2.
3. 3.
4. 4.
5. 5.
6. 6.
7. 7.
8. 8.
Terms 2–4 and 8 vanish in expectation since odd moments of centered Gaussian variables are zero. For the first term, we use the fact that the separation condition implies
[TABLE]
(Otherwise, the product of indicators is zero.) This allows to reduce the summations over to a single sum over . Then, switching the order of summation with , we get that the first term is equal to
[TABLE]
In expectation over the realizations , using again that they are i.i.d. with the same distribution as , this first term yields . Now consider the fifth term. Taking expectation against yields
[TABLE]
Switch the order of summation over and again to get
[TABLE]
Now taking expectation against the signal occurrences yields . A similar reasoning for terms 6 and 7 yields this final formula for the third-order autocorrelations of :
[TABLE]
Appendix B Autocorrelations in the Poisson model
We will denote by the moment tensors of :
[TABLE]
[TABLE]
[TABLE]
We obtain the autocorrelations of by averaging over a slice of the moment tensors:
[TABLE]
[TABLE]
and
[TABLE]
We will make repeated use of the following elementary lemma:
Lemma B.1**.**
If , then
[TABLE]
B.1 Computing
We will first condition on the vector of locations of the subsignals in , and then average over . We will denote by the random vectors starting in . We have:
[TABLE]
Now taking expectations over and using we get:
[TABLE]
Consequently,
[TABLE]
B.2 Computing
First we will consider the noise-free case, where . We will condition on first, and then take the expectation over . Fix , and let . Then:
[TABLE]
We break up the double sum over and into two terms: one where , and one where or equivalently . In the first case, all the terms are independent, and so the expectation factors. In the second case, when we have independence, but otherwise not. This gives (all expectations are conditional on ):
[TABLE]
Now take expectations over the Poisson random variables, using Lemma B.1:
[TABLE]
For positive , we observe that any terms linear in the noise vanish in expectation. Denoting by the clean signal component of length , so , we have:
[TABLE]
We conclude by averaging over and with a fixed value of .
B.3 Computing
We will first assume . For three distinct , and , we let and . We have:
[TABLE]
We will break up the outer three sums into disjoint sums with the following ranges of indices:
and . 2. 2.
and . 3. 3.
and . 4. 4.
and and . 5. 5.
and and .
For Case 1, we have . We further break up the sum:
[TABLE]
For term (a), the expectation conditional on is:
[TABLE]
Using Lemma B.1, the unconditional expectation of (a) is then:
[TABLE]
For term (b), the expectation conditional on is:
[TABLE]
and then again using Lemma B.1 we get the expected value:
[TABLE]
Similarly, the expected values of terms (c) and (d) are:
[TABLE]
and
[TABLE]
Finally, the expected value of term (e) is easily shown to be:
[TABLE]
This concludes the computation for Case 1.
Moving onto Case 2, we have , and also define . By definition, . The sum is:
[TABLE]
Taking expectations conditional on , we then get:
[TABLE]
Taking expectations over and using Lemma B.1 then gives:
[TABLE]
Similarly, Cases 3 and 4 give the expressions:
[TABLE]
and
[TABLE]
Finally, in Case 5 we have , , and are all pairwise distinct. Consequently, the variables are always independent, and the expectation conditional on (letting , ),
[TABLE]
since the ’s are pairwise independent, the expectation over then yields:
[TABLE]
Now we add all the terms from Cases 1 to 5. Expressions (B.18), (B.26), (B.28), (B.30), and (B.33) sum to the expression:
[TABLE]
Expressions (B.20), (B.21), (B.22), (B.27),(B.29), and (B.31) sum to the expression:
[TABLE]
Finally, expression (B.23) is simply:
[TABLE]
Now when , we write , and
[TABLE]
We conclude by averaging over all , , and with fixed values of and .
Appendix C Proof of Proposition 4.2
Refer to equations (3.7)–(3.9) for expressions relating the moments of and those of , and the parameters and . First, note that
[TABLE]
Similarly, using :
[TABLE]
where the last equality is obtained by noting that, given the summation bounds, the set of pairs and are the same over the valid range . To conclude, notice that and combine.
Appendix D Proof of Proposition 4.3
We prove that both and are identifiable from the observed first three moments of . For convenience, we work with rather than itself. To this end, we construct two quadratic equations satisfied by and whose coefficients can be computed from observable quantities. Then, we show that these equations are independent, and hence that is uniquely defined. Given , we can estimate using Proposition 4.2.
Throughout the proof, it is important to distinguish between observed and unobserved values. We denote the observed values by and . We use to denote functions of the signal’s autocorrelations (which are not directly observable).
Recall that and , where is the vector of all-ones and is the 2-norm. Consider the product :
[TABLE]
where . The terms of can also be estimated from , while taking the scaling and bias terms into account. This yields another observable, :
[TABLE]
Therefore, from (D.1) and (D.2) we get:
[TABLE]
Let ; recall from Proposition 4.2:
[TABLE]
Plugging into (D.3) and rearranging, we get a first quadratic equation in ,
[TABLE]
where
[TABLE]
Importantly, these coefficients are observable quantities. As we assume throughout this proof that has nonzero mean, and we conclude that this equation is non-trivial.
Next, we derive the second quadratic equation for . We notice that
[TABLE]
where , and we can work out that:
[TABLE]
Once again, can be estimated from , taking bias and scaling into account:
[TABLE]
Consider the following ratio:
[TABLE]
From the latter, we deduce:
[TABLE]
Using (D.4) and rearranging, we get the second quadratic:
[TABLE]
where
[TABLE]
It is also non-trivial since .
To complete the proof, we need to show that the two quadratic equations (D.5) and (D.8) are independent. To this end, it is enough to show that the ratios between coefficients differ. From (D.5) and (D.1), we have:
[TABLE]
In addition, using (D.6),
[TABLE]
For contradiction, suppose that the quadratics are dependent. Then, , that is,
[TABLE]
Rewriting the identity in terms of and dividing by we get:
[TABLE]
For generic , this polynomial equation is not satisfied so that the quadratic equations are independent. Furthermore, from the inequality it follows immediately that the equations must be independent so long as
[TABLE]
Appendix E Proof of Proposition 4.5
We first note that , , and can be computed directly form the observed autocorrelations (3.11), (3.12) and (3.13). Indeed, recovering and is immediate from (3.11) and (3.12), and then follows from
[TABLE]
Let us assume that is generic. Indeed, we observe the product:
[TABLE]
Since we also observe , we can form the ratio and solve for .
[TABLE]
Then, similarly to Appendix C, one can solve for :
[TABLE]
Once and were computed, one can recover from and by Proposition 4.1.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] Emmanuel Abbe, Tamir Bendory, William Leeb, João M Pereira, Nir Sharon, and Amit Singer. Multireference alignment is easier with an aperiodic translation distribution. IEEE Transactions on Information Theory , 65(6):3565–3584, 2019.
- 2[2] Karim Abed-Meraim, Wanzhi Qiu, and Yingbo Hua. Blind system identification. Proceedings of the IEEE , 85(8):1310–1322, 1997.
- 3[3] Cecilia Aguerrebere, Mauricio Delbracio, Alberto Bartesaghi, and Guillermo Sapiro. Fundamental limits in multi-image alignment. IEEE Transactions on Signal Processing , 64(21):5707–5722, 2016.
- 4[4] Ali Ahmed, Benjamin Recht, and Justin Romberg. Blind deconvolution using convex programming. IEEE Transactions on Information Theory , 60(3):1711–1732, 2014.
- 5[5] GR Ayers and J Christopher Dainty. Iterative blind deconvolution method and its applications. Optics letters , 13(7):547–549, 1988.
- 6[6] Afonso Bandeira, Philippe Rigollet, and Jonathan Weed. Optimal rates of estimation for multi-reference alignment. ar Xiv preprint ar Xiv:1702.08546 , 2017.
- 7[7] Afonso S Bandeira, Ben Blum-Smith, Joe Kileel, Amelia Perry, Jonathan Weed, and Alexander S Wein. Estimation under group actions: recovering orbits from invariants. ar Xiv preprint ar Xiv:1712.10163 , 2017.
- 8[8] Afonso S Bandeira, Moses Charikar, Amit Singer, and Andy Zhu. Multireference alignment using semidefinite programming. In Proceedings of the 5th conference on Innovations in theoretical computer science , pages 459–470. ACM, 2014.
