Multi-Frequency Phase Synchronization
Tingran Gao, Zhizhen Zhao

TL;DR
This paper introduces a multi-frequency approach to phase synchronization, formulating it as a nonconvex optimization problem and developing an efficient algorithm that outperforms existing methods by leveraging harmonic retrieval techniques.
Contribution
The paper presents a novel multi-frequency formulation and a simple two-stage algorithm for phase synchronization, extending to general synchronization over compact Lie groups.
Findings
Algorithm significantly outperforms state-of-the-art methods
Utilizes multi-frequency information for improved accuracy
Achieves these results with only mild additional computational costs
Abstract
We propose a novel formulation for phase synchronization -- the statistical problem of jointly estimating alignment angles from noisy pairwise comparisons -- as a nonconvex optimization problem that enforces consistency among the pairwise comparisons in multiple frequency channels. Inspired by harmonic retrieval in signal processing, we develop a simple yet efficient two-stage algorithm that leverages the multi-frequency information. We demonstrate in theory and practice that the proposed algorithm significantly outperforms state-of-the-art phase synchronization algorithms, at a mild computational cost incurred by using the extra frequency channels. We also extend our algorithmic framework to general synchronization problems over compact Lie groups.
Taxonomy
Topics: Blind Source Separation Techniques · Direction-of-Arrival Estimation Techniques · Sparse and Compressive Sensing Techniques
Keywords: phase synchronization, spectral methods, cryo-EM, harmonic retrieval
1 Introduction
Angular or phase synchronization (Singer, 2011; Boumal, 2016) concerns estimating angles $\theta_1, \dots, \theta_n$ in $[0, 2\pi)$ from a subset of possibly noise-contaminated relative offsets $(\theta_i - \theta_j) \bmod 2\pi$. An instance of phase synchronization can be encoded on an observation graph $G = (V, E)$, where each angle $\theta_i$ is assigned to a vertex $i \in V$ and relative offsets are measured between $\theta_i$ and $\theta_j$ if and only if there is an edge in $G$ connecting vertices $i$ and $j$. Equivalently, the angles can be encoded into a column phase vector $z = (e^{\iota\theta_1}, \dots, e^{\iota\theta_n})^{\top}$, where $\iota = \sqrt{-1}$, and measurements constitute a Hermitian matrix
$$H = A \circ (z z^{*} + \Delta) \qquad (1)$$
where $A$ is the adjacency matrix of the observation graph $G$, $\circ$ is the entrywise product, and the Hermitian matrix $\Delta$ encodes measurement noise.
As a prototypical example of more general synchronization problems arising from many scientific fields concerning consistent pairwise comparisons within large collections of objects (e.g., cryogenic electron microscopy (Singer et al., 2011) and comparative biology (Gao et al., 2019)), phase synchronization has attracted much attention due to its simple yet rich mathematical structure. One mathematical formulation is through nonconvex optimization
$$\max_{x \in \mathbb{C}_1^n} x^{*} H x \qquad (2)$$
where $\mathbb{C}_1^n$ is the Cartesian product of $n$ copies of the unit circle $U(1)$. Depending on the context of the scientific problem, $H$ may be assumed to arise from an additive Gaussian noise model (Boumal, 2016; Bandeira et al., 2017), in which the Hermitian matrix $\Delta$ in (1) is a Wigner matrix with i.i.d. complex Gaussian entries above the diagonal, or from a random corruption model (Singer, 2011; Chen et al., 2016) that assumes
[TABLE]
for each edge . Note that the random corruption model can also be cast in the form (1) after proper shifting and scaling. In general, the additive Gaussian noise model is more amenable to analysis, while the random corruption model is better at capturing the behavior of physical or imaging models where many outliers exist.
In this paper, we propose to tackle the phase synchronization problem by solving an alternative nonconvex optimization problem of “multi-frequency” nature, namely,
$$\max_{x \in \mathbb{C}_1^n} \sum_{k=1}^{k_{\max}} (x^{k})^{*} H^{(k)} x^{k} \qquad (4)$$
where $k_{\max}$ is the number of frequency channels, $x^{k}$ is the entrywise $k$th power of $x$, and $H^{(k)}$ is a Hermitian matrix containing information of the “true signal” in the $k$th frequency component:
- For the random corruption model (3), we construct $H^{(k)}$ directly from $H$ by entrywise power:
$$H^{(k)}_{ij} = (H_{ij})^{k} \qquad (5)$$
- For the additive Gaussian noise model, following (Bandeira et al., 2015; Perry et al., 2018), we assume
[TABLE]
where each is a complex Hermitian random matrix with independent upper diagonal entries, and the scaling is chosen such that the operator norm of is upper bounded by . Unlike (Bandeira et al., 2015; Perry et al., 2018), we allow entries of to be general sub-Gaussian random variables rather than restrictively complex Gaussian, and we do not assume independence of the ’s across different ’s.
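As an illustration of the random corruption model and of the entrywise-power construction (5), the following NumPy sketch builds measurements on a complete graph and stacks their entrywise powers. The function names, the parameter `r` (taken here to be the probability that an entry is uncorrupted), and the convention of setting the diagonal to one are assumptions made for this sketch; they are not notation recovered from the text.

```python
import numpy as np

def random_corruption_measurements(theta, r, rng):
    """Complete-graph measurements under a random corruption model.

    Assumption of this sketch: each off-diagonal entry equals
    exp(i(theta_i - theta_j)) with probability r, and is otherwise a
    uniformly random unit-modulus complex number.
    """
    n = theta.size
    z = np.exp(1j * theta)
    clean = np.outer(z, z.conj())
    outliers = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, size=(n, n)))
    keep = rng.random((n, n)) < r
    H = np.where(keep, clean, outliers)
    # Keep the strict upper triangle and mirror it so that H is Hermitian.
    H = np.triu(H, 1)
    H = H + H.conj().T
    np.fill_diagonal(H, 1.0)
    return H

def multi_frequency_stack(H, k_max):
    """Stack the entrywise powers H^(k)_ij = (H_ij)^k for k = 1..k_max."""
    return np.stack([H ** k for k in range(1, k_max + 1)])
```

Entrywise powers of a Hermitian matrix remain Hermitian, so every slice of the stack can be passed to a single-frequency synchronization routine without further symmetrization.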
We treat the two types of noise (5) and (6) in a unified model, under which we design and analyze our multi-frequency phase synchronization algorithm. We demonstrate surprising theoretical and empirical results that drastically outperform all existing phase synchronization algorithms in their corresponding settings, measured in terms of the correlation between the output and the true phase vector , at a mild increase in the computational cost incurred by parallelizing the computation in frequency channels. As will be demonstrated in Section 4, in the noise regime where phase synchronization is tractable, the number of frequencies needed to outperform single frequency algorithms is at most polylogarithmically dependent on the problem size , while the estimation error decays polynomially in .
Motivation
The rationale behind the multi-frequency formulation (4) lies in the observation that statistical estimation can often benefit from higher moment estimates, even without introducing new measurements. As a motivating example, let be a complete graph, and consider the following coupled problems:
[TABLE]
where , and is generated according to (5). Up to rescaling by a factor of , and fit into model (6) with
[TABLE]
where are i.i.d. uniform on for , and . Note that and are by no means independent, but for all practical purposes satisfy the same sub-Gaussian bounds since and are identically distributed; we thus assume without loss of generality that . If we can find satisfying jointly
[TABLE]
then, by Lemma 1 of (Boumal, 2016) (assuming without loss of generality that ), we have for
[TABLE]
which gives , a tighter bound than one could obtain from (9) with alone, especially for large (with ).
The lesson we learn from this motivating example is that statistical estimation can benefit from leveraging higher-order moment information, even when the moment measurements are not independent of each other. This is particularly prominent for the random corruption model, where all the “higher-order trigonometric moments” in , come from the first moments in by taking entrywise powers. In drastic contrast is the approximate message passing (AMP) algorithm in (Perry et al., 2018), for which independence of the complex Gaussian Wigner noise ’s across the frequency channels plays an essential role. The AMP approach was motivated by the non-unique games (NUG) framework in (Bandeira et al., 2015). Our algorithm follows an efficient two-stage paradigm (initialization and iterative refinement) popularized by recent progress in nonconvex optimization (see, e.g., (Candes et al., 2015; Chen & Candes, 2015)), and combines the trigonometric moments information across frequency channels in a manner akin to classical harmonic retrieval techniques in signal processing (Stoica & Moses, 1997; Tufts & Kumaresan, 1982; Bresler & Macovski, 1986; Ziskind & Wax, 1988; Schmidt, 1986; Roy & Kailath, 1989; Sorensen & De Lathauwer, 2017a, b) and the generalized power method (Boumal, 2016). This strategy easily extends to synchronization over general compact Lie groups, as illustrated in Section 5.
Notations
Upper case letters and lower case letters will be used to denote matrices and vectors, respectively. , are the transpose of with or without conjugation, respectively. The entrywise (Hadamard) product of matrix and will be denoted as . Graphs are always undirected and connected. Vertices of the graph will be denoted as integers ; pairs of integers denote edges in . For we write . Norms , stand for matrix or vector norms, depending on the context; , are matrix operator and Frobenius norms, respectively. The Cartesian product of copies of is denoted as . The quotient space is identified with the unit circle.
2 Related Work
Phase synchronization
Directly solving (2) is NP-hard (Zhang & Huang, 2006), but many convex and nonconvex methods have been proposed to find high-quality approximate solutions. These include spectral and semi-definite programming (SDP) relaxations (Singer, 2011; Cucuringu et al., 2012; Chaudhury et al., 2015; Bandeira et al., 2016, 2017). An alternative approach using the generalized power method (GPM) has also been studied (Boumal, 2016; Liu et al., 2017; Zhong & Boumal, 2018).
Phase synchronization in multiple frequency channels
(Bandeira et al., 2015) proposed the non-unique games (NUG) SDP optimization framework for synchronization over compact Lie groups. The SDP is based on quadratically lifting the irreducible representations of the group elements, and imposing consistency among variables across frequency channels via a Fejér kernel; it is computationally expensive. (Perry et al., 2018) introduced an iterative approximate message passing (AMP) algorithm for noise model (6), assuming the noise matrices are Gaussian and independent across frequency channels. Each iteration of the AMP performs matrix-vector multiplication and entrywise nonlinear transformation, followed by an extra Onsager correction term; it is conjectured to be asymptotically optimal.
3 Algorithm
In this section we formally state the two-stage multi-frequency phase synchronization algorithmic paradigm. Stage One combines phase synchronization outcomes from individual frequency channels with harmonic retrieval, aiming at producing a high-quality initialization; Stage Two iteratively refines an input by an extended generalized power method that works concurrently in multiple frequency channels while striving to maintain entrywise consistency.
3.1 Stage One: Initialization Strategy
Our algorithm takes as input Hermitian measurement matrices , , arising from the general sub-Gaussian model (6) (which includes (5) as a special case). This stage can be divided into three steps.
Step 1. Individual Frequency Synchronization: Apply any phase synchronization algorithm (spectral/SDP relaxation or GPM) to get phase vector estimate from each , , and form ;
Step 2. Entrywise Harmonic Retrieval: For each , use any harmonic retrieval technique to estimate from , , call the estimators ;
Step 3. Final Phase Synchronization: Construct another Hermitian matrix by , and apply any phase synchronization algorithm to estimate the true phases from matrix .
The flexibility of the multi-frequency phase synchronization framework lies in the various choices to be made in each step. As a concrete example, we detail in Algorithm 1 a simple version that uses spectral relaxation for phase synchronization and periodogram-based harmonic retrieval. We will henceforth refer to Algorithm 1 as the periodogram peak extraction with spectral methods (PPE-SPC). If a different phase synchronization method is used, for instance, SDP relaxation, our nomenclature refers to it as PPE-SDP. We will focus on analyzing PPE-SPC in depth in Section 4, but the analysis strategy can in principle be carried over to other variants of this algorithmic paradigm.
We briefly motivate the argmax operation in Step 2 as follows. If our measurement matrices are noise-free, then the th entry of from Step 1 should equal ; in this case, the goal of Step 2 is to reconstruct from its “trigonometric moments,” for which any harmonic retrieval technique can be applied; the periodogram method in Algorithm 1 is among the most naïve approaches for this purpose. For a clean signal, the periodogram is equal to the modulus of the Dirichlet kernel
[TABLE]
which attains its maximum at . Since the peak of becomes sharper and sharper as increases, we expect the periodogram peak identification step to be robust to noise, which will produce a very high quality estimate for Step 3. In fact, our analysis in Section 4 suggests that this initialization stage alone can produce highly accurate phase vectors for sufficiently large , and the estimation error drops inverse-polynomially in .
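The three steps of Stage One can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions rather than a transcription of Algorithm 1: the periodogram is taken to be the symmetric sum over frequencies $-k_{\max}, \dots, k_{\max}$ with the zeroth term set to one, the argmax is approximated by a search over a uniform grid of `grid_size` angles, and the names `leading_eigvec`, `ppe_spc`, and `correlation` are ours.

```python
import numpy as np

def leading_eigvec(H):
    """Leading eigenvector of a Hermitian matrix, scaled to norm sqrt(n)."""
    n = H.shape[0]
    _, vecs = np.linalg.eigh(H)            # eigenvalues come in ascending order
    return vecs[:, -1] * np.sqrt(n)

def ppe_spc(H_stack, grid_size=256):
    """Sketch of spectral initialization with periodogram peak extraction.

    H_stack has shape (k_max, n, n); slice k-1 holds the frequency-k matrix.
    """
    K, n, _ = H_stack.shape
    # Step 1: spectral estimate in each channel, lifted to a rank-one matrix.
    W = np.empty((K, n, n), dtype=complex)
    for k in range(K):
        u = leading_eigvec(H_stack[k])
        W[k] = np.outer(u, u.conj())
    # Step 2: entrywise periodogram on a uniform grid, then peak extraction.
    grid = 2.0 * np.pi * np.arange(grid_size) / grid_size
    ks = np.arange(1, K + 1)
    E = np.exp(-1j * np.outer(ks, grid))                     # shape (K, M)
    P = 1.0 + 2.0 * np.real(np.einsum("kij,km->ijm", W, E))  # shape (n, n, M)
    theta_hat = grid[np.argmax(np.abs(P), axis=-1)]
    # Step 3: spectral synchronization on the denoised phase matrix.
    H_hat = np.exp(1j * theta_hat)
    H_hat = (H_hat + H_hat.conj().T) / 2.0   # guard against grid ties
    u = leading_eigvec(H_hat)
    return u / np.abs(u)                     # round entries to unit modulus

def correlation(x, z):
    """Normalized correlation |z^* x| / n, invariant to a global phase."""
    return np.abs(np.vdot(z, x)) / z.size
```

The periodogram array has `n * n * grid_size` entries, so for large problems one would process the entries in blocks or replace the grid search with an FFT; the dense version is kept here for readability.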
3.2 Stage Two: Iterative Refinement
In this stage, we use an iterative refinement scheme that takes an initial phase vector and enhances it successively. In our implementation we warm-start this iterative algorithm with the produced from the PPE-SPC Algorithm 1, but any initialization scheme can be applied in principle, including random initialization. This iterative refinement concurrently performs the generalized power method (GPM) (Boumal, 2016) in multiple frequency channels consistently: at each frequency , we perform power iteration by multiplication with ; the results are combined across frequency channels to obtain one periodogram for each vertex, followed by a “soft harmonic retrieval” step that soft-thresholds (Donoho, 1995) the periodogram in the frequency domain. We pick a relatively low threshold at the beginning of this iterative scheme, but gradually raise the threshold over the iterations to reveal the true peak that persists. Details can be found in Algorithm 2, henceforth referred to as the multi-frequency generalized power method (MFGPM).
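To make the structure of one refinement step concrete, here is a simplified NumPy sketch. It is not Algorithm 2: the soft-thresholding rule and its schedule are not reproduced in this text, so the sketch substitutes a hard argmax over a uniform grid for the "soft harmonic retrieval" step. The function name, the one-sided sum over frequencies $1, \dots, k_{\max}$, and `grid_size` are assumptions of the sketch.

```python
import numpy as np

def multi_frequency_power_step(H_stack, x, grid_size=256):
    """One simplified refinement step (hard peak extraction, no thresholding).

    H_stack has shape (k_max, n, n); x is the current unit-modulus estimate.
    """
    K = H_stack.shape[0]
    ks = np.arange(1, K + 1)
    # Power iteration in every channel: Y[k-1] = H^(k) x^k (entrywise power).
    X = x[None, :] ** ks[:, None]                  # shape (K, n)
    Y = np.einsum("kij,kj->ki", H_stack, X)        # shape (K, n)
    # Combine channels into one periodogram per vertex.
    grid = 2.0 * np.pi * np.arange(grid_size) / grid_size
    E = np.exp(-1j * np.outer(ks, grid))           # shape (K, M)
    P = np.abs(np.einsum("ki,km->im", Y, E))       # shape (n, M)
    # Hard peak extraction stands in for the soft-thresholding step.
    return np.exp(1j * grid[np.argmax(P, axis=1)])
```

Calling the function repeatedly, starting from a Stage One estimate, gives a crude iterative scheme; the soft-thresholded version described above is presumably more forgiving when the periodogram has several competing peaks.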
MFGPM can be viewed as an iterative version of PPE-SPC, except that the stringent peak extraction step is replaced with the more malleable soft-thresholding. Periodograms are virtually the Dirichlet kernels, which truncate a Dirac delta function in the frequency domain; one can also take Cesàro means of these periodograms, or equivalently, work with the Fejér kernels that are known to converge faster to the Dirac delta function. We omit those results as no significant difference is observed in performance.
As an integral part of our two-stage algorithmic framework, MFGPM works most efficiently with initialization from PPE-SPC, but we also observed empirically that MFGPM outperforms other methods given identical random initialization, as illustrated in Figure 1, in the sense that MFGPM often produces phase vectors that correlate more strongly with the true phase vector . See Section 6 for more comprehensive comparison results.
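The relation between the two kernels mentioned above can be checked numerically. The sketch below uses the standard textbook definitions, with the Dirichlet kernel written as a symmetric sum over frequencies $-K, \dots, K$ (an assumption about normalization, since the displayed formula is not available here) and the Fejér kernel as the Cesàro mean of the Dirichlet kernels of orders $0, \dots, K$. The function names are illustrative.

```python
import numpy as np

def dirichlet_kernel(x, K):
    """D_K(x) = sum_{k=-K}^{K} exp(i k x); the sum is real-valued."""
    ks = np.arange(-K, K + 1)
    return np.real(np.exp(1j * np.outer(x, ks)).sum(axis=1))

def fejer_kernel(x, K):
    """Cesaro mean of D_0, ..., D_K, i.e. (1 / (K + 1)) * sum_m D_m(x)."""
    return np.mean([dirichlet_kernel(x, m) for m in range(K + 1)], axis=0)

x = np.linspace(-np.pi, np.pi, 1001)
K = 5
D = dirichlet_kernel(x, K)
F = fejer_kernel(x, K)
# The Dirichlet kernel oscillates in sign; the Fejer kernel is nonnegative.
print(D.min() < 0, F.min() > -1e-9)
```

Averaging replaces the flat frequency weights of the Dirichlet kernel with triangular weights $1 - |k|/(K+1)$, which removes the negative side lobes at the price of a wider main lobe.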
The computational complexities of PPE-SPC and MFGPM are and , respectively.
4 Analysis
In this section we analyze PPE-SPC in theory, under the general sub-Gaussian noise model (6). We assume the observation graph is generated from an Erdős–Rényi model with edge connectivity independent of the ’s.
**Assumption 1.**
For and each , assume
[TABLE]
where is the entrywise th power of , and , are complex random Wigner matrices satisfying the following assumptions:
- (1) For any fixed , are jointly independent with zero mean and unit sub-Gaussian norm (Vershynin, 2018);
- (2) for all and ;
- (3) for all and .
Furthermore, assume is the adjacency matrix of an Erdős–Rényi random graph independent of all the ’s, with edge connecting probability .
We emphasize again that Assumption 1 assumes no independence for the ’s across frequency channels; only entries within the same are assumed independent. As explained in the Introduction, this enables us to unify our discussions on the random corruption model and additive Gaussian model in a single pass (see e.g., (8)). Another advantage of such generality is that we can focus on analyzing complete observation graphs, since
[TABLE]
where is the identity matrix of dimension -by-, and thus we can apply the theoretical analysis in this section to where
[TABLE]
satisfies the same conditions as in Assumption 1 with different absolute constants. Therefore, in the rest of this section we focus on complete observation graph only, i.e.,
[TABLE]
Our first goal is to understand the spectral method in PPE-SPC Step 1 and Step 3. Since Step 2 is entrywise, it is crucial to bound the distance between and the leading eigenvector (scaled to ). The proof of the following Lemma 1 uses recent perturbation results of eigenvectors of random matrices (Eldridge et al., 2017; Abbe et al., 2017; Fan et al., 2018; Zhong & Boumal, 2018) and can be found in the supplemental material.
**Lemma 1.**
Assume Assumption 1 is satisfied, and the observation graph is a complete graph. Let be an arbitrarily chosen but fixed absolute constant. For any , denote for the leading eigenvector of scaled such that and . There exist absolute (in particular, independent of and ) constants such that, if , there holds with probability
[TABLE]
The inequality (15) is a direct consequence of (14), which is identical to Theorem 8 of (Zhong & Boumal, 2018), but we verify in the proof that the event probability in (Zhong & Boumal, 2018) can be made slightly higher. This is necessary for taking the union bound across all entries in the main Theorem 2.
A quick consequence of Lemma 1 is the uniform proximity of the periodogram to a Dirichlet kernel up to constant scaling and shifts, with high probability. More specifically,
[TABLE]
with probability . Clearly, the maximum of is attained at . We thus expect the argmax operation in Step 2 of PPE-SPC to produce high accuracy estimates of as long as the difference between the “optimization landscape” of the periodogram and the Dirichlet kernel is small enough. This is formalized in the following lemma, which exploits the geometry of the Dirichlet kernel.
**Lemma 2.**
Under the same conditions as in Lemma 1, if
[TABLE]
then with probability at least
[TABLE]
It is straightforward to check that (16) holds for sufficiently large as long as is bounded from above by . This can be seen by noticing that the function is differentiable and monotonically decreasing for all , and for sufficiently large it infinitesimally approaches .
The most important message from Lemma 2 is the following: At the beginning of Step 3 of PPE-SPC, the newly constructed Hermitian matrix is entrywise –close to the ground truth rank-one matrix . We emphasize that this error incurred in is significantly smaller than the noise level in the raw input data, and can be made arbitrarily small by choosing large . We formalize this key observation in the main theorem below, for which the proof is deferred to the supplemental material.
**Theorem 2.**
Under the same conditions as Lemma 1 and Lemma 2, if (16) holds and , then there exists an absolute constant such that, with probability , the correlation between the true phase vector and the leading eigenvector (scaled to ) of in PPE-SPC Step 3 is at least
[TABLE]
provided that
[TABLE]
Moreover, for the phase vector output from PPE-SPC,
[TABLE]
Following the discussion after Lemma 2, it is not surprising to see in Theorem 2 that the correlation can be made arbitrarily close to (or equivalently, the distance between the estimated and true phase vectors can be made arbitrarily close to $0$). Moreover, it does not take excessively large for PPE-SPC to outperform all existing phase synchronization algorithms; in fact, for which is the highest level of noise tolerable to ensure the validity of Lemma 1, it suffices to take to suppress the estimation error below the established near-optimal bound for eigenvector-based phase synchronization methods (Bandeira et al., 2017; Zhong & Boumal, 2018). We believe (18) can still be improved by a factor of by leveraging the randomness in the residue error in (15), but such finer analysis relies on a more detailed study of the perturbation and the change in the optimization landscape, which will be pursued in future work.
5 Extension to General Synchronization
The algorithmic framework of multi-frequency phase synchronization proposed in this paper can be extended to synchronization over any compact Lie group , by the representation-theoretic analogue of Fourier series — the Peter–Weyl decomposition. In a nutshell, the Peter–Weyl theorem states that, for square integrable functions , we have decomposition
[TABLE]
where each is an irreducible, unitary representation of , and is the “Fourier coefficient”
[TABLE]
where the integral is taken with respect to the Haar measure.
On a connected observation graph , the input data to a synchronization problem over group are pairwise measurements on edges satisfying . The goal is to find group elements , one for each vertex, that satisfy as many constraints as possible. Mathematically, this type of problem can often be formulated as an optimization problem (Bandeira et al., 2015)
[TABLE]
where each measures the compatibility between the relative alignment and the observation data on edge . The ’s are nonlinear and nonconvex in general. If are bandlimited, we can expand (21) using the Peter–Weyl decomposition
[TABLE]
which can be viewed as a generalization of the multi-frequency phase synchronization problem (4).
For simplicity of statement, we assume the observation graph is complete in this section. Since ’s are unitary representations, the matrices ’s are unitary matrices for any , and it is natural to solve for from its irreducible representations . Vertically stacking the th irreducible representations together, the variable can be organized in matrices , defined by
[TABLE]
Analogues of the noise models also exist in this more general setting. The additive Gaussian noise model, following (Perry et al., 2018), amounts to
[TABLE]
where the parameter stands for the signal-to-noise ratio (SNR) at “frequency ,” is a Wigner matrix with i.i.d. standard complex Gaussian entries in the upper triangular part. For the random corruption model, let
[TABLE]
and set the th sub-block of to .
As we elaborate in the remainder of this section, all the key ingredients in PPE-SPC and MFGPM can be extended to this more general setting. We demonstrate the efficacy of this algorithm for synchronization in Section 6.
**Spectral relaxation:** Compute the top eigenvectors and stack them horizontally to form . Approximate with .
**Generalized harmonic retrieval:** For each , set
[TABLE]
Based on these new estimates for the pairwise alignments, we build matrix with blocks with . We then extract the top eigenvectors of , stack them horizontally to form , and project each of its vertical blocks to a unitary matrix through singular value decomposition (SVD)
[TABLE]
**Iterative refinement:** At the th iteration, denoting for the current stacked th representations (22), we construct
[TABLE]
and compute the inverse Fourier transform for each of the vertical sub-blocks of :
[TABLE]
Note that we only need to evaluate on a finite number of uniformly sampled elements of , from which the “inverse Fourier transform” can be applied
[TABLE]
along with the soft-thresholding . We again project each to the closest unitary matrix by SVD (26), then form by vertically stacking the ’s. The final outputs are for .
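The SVD-based projection onto unitary matrices used above is a standard construction: for $M = U \Sigma V^{*}$, the matrix $U V^{*}$ is the closest unitary matrix to $M$ in Frobenius norm. A minimal NumPy sketch, with an illustrative function name, is:

```python
import numpy as np

def project_to_unitary(M):
    """Closest unitary matrix to M in Frobenius norm (unitary polar factor)."""
    U, _, Vh = np.linalg.svd(M)
    return U @ Vh

# Example: a noisy copy of a unitary matrix is mapped back onto the group.
rng = np.random.default_rng(0)
R, _ = np.linalg.qr(rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3)))
noisy = R + 0.05 * (rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3)))
Q = project_to_unitary(noisy)
print(np.allclose(Q @ Q.conj().T, np.eye(3)))
```

For groups with an additional determinant constraint, such as rotation groups, a sign correction on the last singular vector is commonly added; that refinement is not shown here.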
6 Numerical Experiments
This section contains detailed numerical results under both additive Gaussian noise and random corruption models, for both and . In all experiments with Gaussian noise, we keep where is the signal-to-noise ratio (SNR); for the random corruption model (3) we set . We fix and vary and to evaluate and compare the performance of different algorithms. When comparing iterative algorithms (AMP, GPM, MFGPM), within each random trial the random initialization is kept identical for all three algorithms and across frequency channels; between trials both data and initialization are redrawn. The remainder of the section contains results for and synchronization with complete observation graphs only; incomplete observation graph results are similar and included in the supplemental material.
** synchronization: ** In Figure 2 and Figure 3, we measure the correlation between the output and the true phase vector for various single- and multi-frequency synchronization methods, under the additive Gaussian and random corruption noise model, respectively. The SNR varies between and , which is in the extremely noisy regime: under the random corruption model, for instance, with , between and of the pairwise alignments are corrupted with random elements. In each subplot, the vertical axis varies from to , and the horizontal axis marks the change in . The bottom row in each subplot thus represents the single-frequency () version of the algorithm. The methods under comparison are: (a) AMP (Perry et al., 2018) with random initialization; (b) PPE-SPC; (c) MFGPM with random initialization; (d) PPE-SDP (replacing the spectral methods in Algorithm 1 with SDP relaxation); (e) PPE-SDP with an additional projection to rank-one matrices in each iteration; (f) Iterating PPE-SPC three times; (g) AMP initialized with PPE-SPC; (h) MFGPM initialized with PPE-SPC.
It is clear from Figure 2 and Figure 3 that leveraging information in multiple frequency channels produces results superior to single-frequency approaches. Most strikingly, in Figure 3 our proposed PPE-SPC method and variants [subplots (b)–(h)] are capable of recovering the true phase vector when the SNR is well below the critical threshold (corresponding to ) determined in (Singer, 2011) by random matrix arguments. This is surprising because, according to (Singer, 2011), for single frequency phase synchronization one cannot expect correlation to be much higher than , which is in our experiments. This is confirmed by looking at the bottom row of each subplot of Figure 3, but with suitably large this barrier no longer exists, even though in model (5) our high-frequency measurements are generated from the single frequency data.
In Figure 2 and Figure 3, (d) and (e) illustrate the performance of the SDP variant of PPE-SPC. The difference between (d) and (e) is the following: in (d) we use directly estimated by solving the SDP in (Singer, 2011), but in (e) we project the SDP solution to a rank-one matrix using eigen-decomposition. The results from these SDP variants are occasionally slightly better than PPE-SPC, but the computational cost is high: the runtime is over times longer, and much more memory is required. The SDP relaxation in (Bandeira et al., 2015) is even more demanding on computational resources and so is not included here.
Figures 2f and 3f explore another possibility of extending PPE-SPC: After recovering , take entrywise powers of and treat them as multi-frequency data input to another fresh run of PPE-SPC. Unlike the iterative refinement algorithm MFGPM, we observed empirically that the performance boost saturates quickly after just a couple of such repeated calls to PPE-SPC. The results in (f) from both figures are obtained from performing such repetitions. Compared with (b), this strategy improves the estimation accuracy for smaller , but the performance gain is not as significant as using MFGPM for iterative refinements (h).
Initialization turns out to be important for AMP: As shown in Figure 2a, when the SNR is below the critical threshold predicted in (Perry et al., 2018) (), increasing does not lead to performance improvement; the critical threshold appears even higher for the random corruption model (Figure 3a). In contrast, PPE-SPC and MFGPM can always benefit from sufficiently large .
** synchronization: ** Comparison results for synchronization under Gaussian noise model and random corruption model are shown in Figure 4a and 4b, respectively. In all these experiments, the Fourier transform (27) is numerically evaluated using elements uniformly sampled in . Clearly, the proposed method outperforms single frequency methods and achieves higher accuracy as increases; moreover, the multi-frequency formulation and algorithm lead to a drastic performance boost, especially in the “low SNR regime.”
In Figure 5 we compare AMP and MFGPM with different initialization strategies (PPE-SPC vs. random initialization) under the additive Gaussian noise model (23) with . We plot the accuracy of using PPE-SPC alone without iterative refinement as a baseline. The results demonstrate the performance boost from using PPE-SPC for initialization, as well as the improvement gained from using iterative refinements on top of the PPE-SPC initialization.
7 Conclusion
In this paper, we propose a novel, multi-frequency formulation for phase synchronization as a nonconvex optimization problem, for which we develop a two-stage algorithm inspired by harmonic retrieval and the generalized power method that produces high-accuracy approximate solutions. We demonstrate in theory and experiments that the new framework significantly outperforms all existing phase synchronization algorithms.
There are many opportunities for future research. We are particularly interested in gaining a deeper theoretical understanding of the multi-frequency GPM algorithm, especially its performance guarantees and behavior near local optima. More general harmonic retrieval techniques can potentially be used in place of the periodogram-based peak extraction. We are also working on extending the algorithmic framework beyond compact Lie groups, such as Euclidean groups and symmetric groups, with applications to object matching (Shen et al., 2016; Pachauri et al., 2013).
Acknowledgements
Tingran Gao acknowledges support from an AMS-Simons Travel Grant and partial support from DARPA D15AP00109 and NSF IIS 1546413.
Appendix A Technical Proofs
A.1 Proof of Lemma 1
Proof of Lemma 1.
The conclusion of this lemma is identical to Theorem 8 of (Zhong & Boumal, 2018); the only difference is that the event probability is slightly larger — in Theorem 8 of (Zhong & Boumal, 2018) the event probability is . This can be done by straightforwardly modifying the arguments in the proof of the Theorem 8 of (Zhong & Boumal, 2018), and at the expense of increasing the absolute constant picked in that proof. Actually, this is already stated by the authors of (Zhong & Boumal, 2018) on page 998 of the published version, in the paragraph right below their Theorem 5. We document here how this modification can be done.
The randomness in the proof of Theorem 8 of (Zhong & Boumal, 2018) arises only from the dependence of Lemma 9 and Lemma 10 of (Zhong & Boumal, 2018), so it is sufficient to track the failure probability of the events there. These modifications only need to be stated for real sub-Gaussian random variables, as the trivial passage from real to complex cases is the same as detailed in the proof of Lemma 9 of (Zhong & Boumal, 2018).
Lemma 9 of (Zhong & Boumal, 2018) is based on the well-known concentration results on the maximum singular value of sub-Gaussian random matrices, in particular, Proposition 2.4 of (Rudelson & Vershynin, 2010), which states for any sub-Gaussian random matrix of dimension -by- with independent, zero mean sub-Gaussian entries (whose subgaussian moments are bounded by ) that, for any ,
[TABLE]
where are positive absolute constants. We take here , so with probability at least . Obviously, there exists sufficiently large absolute constant such that
[TABLE]
where is the arbitrarily chosen but fixed constant in the statement of our Lemma 1.
Lemma 10 of (Zhong & Boumal, 2018) attains the event probability by taking a union bound, over instances of and instances of , for individual event probabilities of , where is an absolute positive constant. However, note that in the case of eigenvectors, we have (consisting of a singleton, cf. the second paragraph on p. 1000 of (Zhong & Boumal, 2018), right above the section title “Introducing auxiliary eigenvector problems”), which is two orders of magnitude smaller than the bound stated in Lemma 10 of (Zhong & Boumal, 2018). The union bound thus yields a success probability of at least , which is .
Combining the two bounds leads to a success probability of for any .
For the last inequality, note that , , and , and note that for all and . We have
[TABLE]
where in the last inequality we used the assumption . ∎
A.2 Proof of Lemma 2
Proof of Lemma 2.
The proof starts with some elementary observations for the Dirichlet kernel , defined as
[TABLE]
Note the following (cf. Figure 6):
- (1) is upper bounded by ;
- (2) vanishes at , for ;
- (3) A unique local maximum exists between each pair of consecutive zeros on .
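These three observations can be checked numerically. The sketch below assumes the standard symmetric Dirichlet kernel D_n(x) = sum_{k=-n}^{n} exp(i k x); the normalization and index range used in the paper may differ, and the order n = 8 is an arbitrary illustrative choice.

```python
import numpy as np


def dirichlet(x: np.ndarray, n: int) -> np.ndarray:
    """Standard Dirichlet kernel D_n(x) = sum_{k=-n}^{n} exp(i k x), which is real-valued."""
    k = np.arange(-n, n + 1)
    return np.exp(1j * np.outer(x, k)).sum(axis=1).real


n = 8
x = np.linspace(-np.pi, np.pi, 20001)
d = dirichlet(x, n)

# (1) |D_n| is bounded by 2n + 1, with equality at x = 0.
max_abs = np.abs(d).max()

# (2) D_n vanishes at x = 2*pi*m / (2n + 1) for m = 1, ..., 2n.
zeros = 2 * np.pi * np.arange(1, 2 * n + 1) / (2 * n + 1)
residual = np.abs(dirichlet(zeros, n)).max()

# (3) Height of the largest side lobe, i.e. the maximum of |D_n| outside the main lobe.
side_lobe = np.abs(d[np.abs(x) >= zeros[0]]).max()
side_ratio = side_lobe / (2 * n + 1)

print(max_abs, residual, side_ratio)
```

The gap between the main peak and the highest side lobe is what leaves room for noise in the comparison between the upper and lower bounds that follows.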
Let be the local maximizer attaining the highest “side lobe” of between and in Figure 6. When , by Lemma 1, the periodogram will not exceed
[TABLE]
On the other hand, again by Lemma 1, the periodogram stays above
[TABLE]
Therefore, as long as the upper bound (29) is no greater than the lower bound (30), which one can check is satisfied if condition (19) in the statement of the lemma holds, i.e., if
[TABLE]
then the peak location of the periodogram can occur nowhere other than within , which gives the conclusion
[TABLE]
with . This completes the proof.
∎
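The peak-extraction step analyzed in this lemma can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the periodogram is taken to be the real part of sum_k z_k exp(-i k phi) over frequencies k = 1, ..., k_max, it is maximized by brute force on a uniform grid, and the noise is additive complex Gaussian. The function name, grid size, and parameter values are hypothetical.

```python
import numpy as np


def periodogram_peak(z: np.ndarray, grid_size: int = 4096) -> float:
    """Estimate theta from z[k-1] ~ exp(i k theta) + noise, k = 1..k_max,
    by maximizing Re(sum_k z_k exp(-i k phi)) over a uniform grid of phi."""
    k = np.arange(1, z.size + 1)
    phi = np.linspace(0.0, 2 * np.pi, grid_size, endpoint=False)
    # Row j of the product holds z_k * exp(-i k phi_j) for every frequency k.
    p = (z[None, :] * np.exp(-1j * np.outer(phi, k))).sum(axis=1).real
    return float(phi[np.argmax(p)])


rng = np.random.default_rng(1)
theta, k_max, sigma = 1.234, 16, 0.3  # hypothetical ground truth and noise level
k = np.arange(1, k_max + 1)
noise = sigma * (rng.standard_normal(k_max) + 1j * rng.standard_normal(k_max)) / np.sqrt(2)
z = np.exp(1j * k * theta) + noise

theta_hat = periodogram_peak(z)
# Angular error, wrapped to [-pi, pi] before taking the absolute value.
err = abs(np.angle(np.exp(1j * (theta_hat - theta))))
print(theta_hat, err)
```

A grid search is used here only for transparency; a zero-padded FFT would evaluate the same periodogram more efficiently.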
A.3 Proof of Theorem 2
Proof of Theorem 2.
First, we note that the second part of the theorem about follows directly from Proposition 1 of (Liu et al., 2017), as in the proof of Lemma 8 of (Zhong & Boumal, 2018).
Assume for the moment that the key assumption in Lemma 2 is satisfied, namely, that and have been chosen such that
[TABLE]
By a union bound over each of the estimated relative phases obtained at the end of Step 2 of Algorithm 1, with probability at least we have, for all ,
[TABLE]
and thus
[TABLE]
Therefore,
[TABLE]
where the last equality follows from bounding each entry of individually using the rightmost term in (32). (Note that by doing so we do not need any information on the randomness of .) By the Davis–Kahan Theorem in Lemma 11 of (Zhong & Boumal, 2018), as long as , which by (33) is guaranteed if , the angle between and satisfies
[TABLE]
where in the last inequality we used the fact that for all . Therefore, setting , we have
[TABLE]
Now we seek lower bounds for and that satisfy (31) under the condition imposed in Lemma 1. Obviously, (31) is satisfied if
[TABLE]
Using the elementary inequality (Kroopnick, 1997)
[TABLE]
we know that a sufficient condition for (34) to hold is
[TABLE]
which is further equivalent to
[TABLE]
Note that for all we have , and thus . Therefore, a sufficient condition for (36) to hold is
[TABLE]
∎
Appendix B Extra Numerical Results
For the following experiments, we consider an incomplete graph with vertices drawn from the Erdős–Rényi random graph model with edge connection probability . Figure 7 shows that Algorithm 1 (PPE-SPC) is also robust for incomplete graphs.
Figures 8 and 9 show the performance of our PPE-SPC and its variant PPE-SPC3 on a complete graph with vertices.
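An incomplete-graph instance of this kind can be simulated along the following lines. This is a sketch under stated assumptions: the additive Hermitian Gaussian noise model, the function name, and the values of n, p, and sigma are all hypothetical and are not taken from the experiments reported here, which may use a different noise model and different parameters.

```python
import numpy as np


def er_phase_measurements(n: int, p: float, sigma: float, rng: np.random.Generator):
    """Sample an Erdős–Rényi edge mask and noisy relative phases on its edges."""
    theta = rng.uniform(0.0, 2 * np.pi, n)
    z = np.exp(1j * theta)  # ground-truth phases on the unit circle
    # Each undirected edge is kept independently with probability p; no self-loops.
    upper = np.triu(rng.random((n, n)) < p, k=1)
    mask = upper | upper.T
    # Hermitian complex Gaussian noise with zero diagonal (assumed noise model).
    g = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    w = np.triu(g, k=1)
    w = w + w.conj().T
    # Observed pairwise comparisons, zeroed outside the sampled edge set.
    h = mask * (np.outer(z, z.conj()) + sigma * w)
    return z, mask, h


rng = np.random.default_rng(2)
z, mask, h = er_phase_measurements(n=100, p=0.3, sigma=0.5, rng=rng)
print(mask.sum() // 2, "edges observed out of", 100 * 99 // 2)
```

Any synchronization method can then be run on the masked matrix h, with the mask indicating which pairwise comparisons are available.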
References
- Abbe, E., Fan, J., Wang, K., and Zhong, Y. Entrywise eigenvector analysis of random matrices with low expected rank. arXiv preprint arXiv:1709.09565, 2017.
- Bandeira, A., Chen, Y., and Singer, A. Non-unique games over compact groups and orientation estimation in cryo-EM. arXiv preprint arXiv:1505.03840, 2015.
- Bandeira, A. S., Kennedy, C., and Singer, A. Approximating the little Grothendieck problem over the orthogonal and unitary groups. Mathematical Programming, 160(1-2):433–475, 2016.
- Bandeira, A. S., Boumal, N., and Singer, A. Tightness of the maximum likelihood semidefinite relaxation for angular synchronization. Mathematical Programming, 163(1-2):145–167, 2017.
- Boumal, N. Nonconvex Phase Synchronization. SIAM Journal on Optimization, 26(4):2355–2377, 2016.
- Bresler, Y. and Macovski, A. Exact maximum likelihood parameter estimation of superimposed exponential signals in noise. IEEE Transactions on Acoustics, Speech, and Signal Processing, 34(5):1081–1089, 1986.
- Candes, E. J., Li, X., and Soltanolkotabi, M. Phase Retrieval via Wirtinger Flow: Theory and Algorithms. IEEE Transactions on Information Theory, 61(4):1985–2007, 2015.
- Chaudhury, K. N., Khoo, Y., and Singer, A. Global Registration of Multiple Point Clouds Using Semidefinite Programming. SIAM Journal on Optimization, 25(1):468–501, 2015.
