Multi-Frequency Phase Synchronization

Tingran Gao; Zhizhen Zhao

arXiv:1901.08235·cs.IT·May 15, 2019


TL;DR

This paper introduces a multi-frequency approach to phase synchronization, formulating it as a nonconvex optimization problem and developing an efficient algorithm that outperforms existing methods by leveraging harmonic retrieval techniques.

Contribution

The paper presents a novel multi-frequency formulation and a simple two-stage algorithm for phase synchronization, extending to general synchronization over compact Lie groups.

Findings

01

Algorithm significantly outperforms state-of-the-art methods

02

Utilizes multi-frequency information for improved accuracy

03

Achieves these results with only mild additional computational costs

Abstract

We propose a novel formulation for phase synchronization -- the statistical problem of jointly estimating alignment angles from noisy pairwise comparisons -- as a nonconvex optimization problem that enforces consistency among the pairwise comparisons in multiple frequency channels. Inspired by harmonic retrieval in signal processing, we develop a simple yet efficient two-stage algorithm that leverages the multi-frequency information. We demonstrate in theory and practice that the proposed algorithm significantly outperforms state-of-the-art phase synchronization algorithms, at a mild computational cost incurred by using the extra frequency channels. We also extend our algorithmic framework to general synchronization problems over compact Lie groups.

Equations

$$H = A \circ \left[z z^{*} + \Delta\right]$$

$$\max_{x \in \mathbb{C}_{1}^{n}} x^{*} H x$$

$$H_{ij} = \begin{cases} z_{i}\bar{z}_{j} & \text{with prob. } r \in \left[0,1\right] \\ w \sim \mathrm{Unif}\left(\mathrm{U}\!\left(1\right)\right) & \text{with prob. } 1-r \end{cases}$$

$$\max_{x \in \mathbb{C}_{1}^{n}} \sum_{k=1}^{k_{\mathrm{max}}} \left(x^{k}\right)^{*} H^{\left(k\right)} x^{k}$$

$$H_{ij}^{\left(k\right)} = H_{ij}^{k}, \quad k = 1, \dots, k_{\mathrm{max}}; \ 1 \leq i, j \leq n.$$

$$H^{\left(k\right)} = A \circ \left[z^{k} \left(z^{*}\right)^{k} + \sigma_{k} \Delta^{\left(k\right)}\right]$$

$$\max_{x \in \mathbb{C}_{1}^{n}} x^{*} H^{\left(1\right)} x; \qquad \max_{x \in \mathbb{C}_{1}^{n}} \left(x^{2}\right)^{*} H^{\left(2\right)} x^{2}$$

$$\sigma_{k} \Delta_{ij}^{\left(k\right)} = \begin{cases} \left(1-r\right) z_{i}\bar{z}_{j} & \text{with prob. } r \in \left[0,1\right] \\ e^{\iota k \varphi_{ij}} - r z_{i}\bar{z}_{j} & \text{with prob. } 1-r \end{cases}$$

$$\left(\hat{x}^{k}\right)^{*} H^{\left(k\right)} \hat{x}^{k} \geq \left(z^{k}\right)^{*} H^{\left(k\right)} z^{k}, \quad k = 1, 2$$

$$\left\|\hat{x}^{k} - z^{k}\right\|_{2}^{2} = 2\left(n - \left|z^{*}\hat{x}\right|^{k}\right) \leq 16 \sigma^{2} \left\|\Delta^{\left(k\right)}\right\|_{2}^{2} / n$$

$$\mathrm{Dir}_{k_{\mathrm{max}}}\left(\theta_{ij} - \theta\right) = \frac{\sin\left[\left(k_{\mathrm{max}} + 1/2\right)\left(\theta_{ij} - \theta\right)\right]}{\sin\left[\left(\theta_{ij} - \theta\right)/2\right]}$$

$$H^{\left(k\right)} = A \circ \left[z^{k} \left(z^{k}\right)^{*} + \sigma \Delta^{\left(k\right)}\right]$$

$$\mathbb{E} H^{\left(k\right)} = p z^{k} \left(z^{k}\right)^{*} - p I_{n}$$

$$E^{\left(k\right)} = \frac{1}{p}\left\{A \circ \left[z^{k} \left(z^{k}\right)^{*} + \sigma \Delta^{\left(k\right)}\right]\right\} - z^{k} \left(z^{k}\right)^{*} + I_{n}$$

$$H^{\left(k\right)} = z^{k} \left(z^{k}\right)^{*} + \sigma \Delta^{\left(k\right)}, \quad 1 \leq k \leq k_{\mathrm{max}}.$$

$$\left\|u^{\left(k\right)} - z^{k}\right\|_{\infty}$$

$$\left|W_{ij}^{\left(k\right)} - z_{i}^{k}\bar{z}_{j}^{k}\right|$$

$$\left|\mathrm{Re}\left\{\sum_{k=1}^{k_{\mathrm{max}}} W^{\left(k\right)}_{ij} e^{-\iota k \phi}\right\} - \frac{1}{2}\left[\mathrm{Dir}_{k_{\mathrm{max}}}\left(\theta_{i} - \theta_{j} - \phi\right) - 1\right]\right| \leq \left|\sum_{k=1}^{k_{\mathrm{max}}} \left(W_{ij}^{\left(k\right)} - z_{i}^{k}\bar{z}_{j}^{k}\right) e^{-\iota k \phi}\right| \leq 2 C_{2} k_{\mathrm{max}} \sigma \sqrt{\log n / n}$$

$$\left[2 k_{\mathrm{max}} \sin\left(\pi / \left(2 k_{\mathrm{max}} + 1\right)\right)\right]^{-1} + 4 C_{2} \sigma \sqrt{\log n / n} < 1$$

$$\left|\hat{\theta}_{ij} - \left(\theta_{i} - \theta_{j}\right)\right| \leq 4\pi / \left(2 k_{\mathrm{max}} + 1\right).$$

$$\mathrm{Corr}\left(\hat{u}, z\right) \geq 1 - C_{3} / k_{\mathrm{max}}^{2}$$

$$k_{\mathrm{max}} > \max\left\{5, \left(\sqrt{2}\pi\left(1 - 4 C_{2} \sigma \sqrt{\log n / n}\right) - 2\right)^{-1}\right\}.$$

$$\mathrm{Corr}\left(\hat{x}, z\right) \geq 1 - 4 C_{3} / k_{\mathrm{max}}^{2}.$$

$$f\left(g\right) = \sum_{k=0}^{\infty} d_{k} \mathrm{tr}\left(\hat{f}\left(k\right) \rho_{k}\left(g\right)\right)$$

$$\hat{f}\left(k\right) = \int_{\mathcal{G}} f\left(g\right) \rho_{k}\left(g\right) \mathrm{d}g$$

$$\min_{g_{1}, \dots, g_{n} \in \mathcal{G}} \sum_{i,j=1}^{n} f_{ij}\left(g_{i} g_{j}^{-1}\right)$$

$$\sum_{i,j=1}^{n} f_{ij}\left(g_{i} g_{j}^{-1}\right) = \sum_{k=0}^{k_{\mathrm{max}}} \sum_{i,j=1}^{n} d_{k} \mathrm{tr}\left[\hat{f}_{ij}\left(k\right) \rho_{k}\left(g_{i}\right) \rho_{k}^{*}\left(g_{j}\right)\right]$$

$$X^{\left(k\right)} = \left[\rho_{k}\left(g_{1}\right), \dots, \rho_{k}\left(g_{n}\right)\right]^{\top}.$$

$$H^{\left(k\right)} = \frac{\lambda_{k}}{n} X^{\left(k\right)} \left(X^{\left(k\right)}\right)^{*} + \frac{1}{\sqrt{n d_{k}}} \Delta^{\left(k\right)}$$

$$g_{ij} = \begin{cases} g_{i} g_{j}^{-1} & \text{with probability } r \\ \tilde{g} \sim \mathrm{Unif}\left(\mathcal{G}\right) & \text{with probability } 1-r \end{cases}$$


Taxonomy

Topics: Blind Source Separation Techniques · Direction-of-Arrival Estimation Techniques · Sparse and Compressive Sensing Techniques

Full text


Keywords: phase synchronization, spectral methods, cryo-EM, harmonic retrieval

1 Introduction

Angular or phase synchronization (Singer, 2011; Boumal, 2016) concerns estimating angles $\theta_{1},\dots,\theta_{n}$ in $\left[0,2\pi\right)$ from a subset of possibly noise-contaminated relative offsets $\left(\theta_{i}-\theta_{j}\right)\!\!\!\mod 2\pi$ . An instance of phase synchronization can be encoded on an observation graph $G=\left(V,E\right)$ , where each angle is assigned to a vertex $i\in V$ and relative offsets are measured between $\theta_{i}$ and $\theta_{j}$ if and only if there is an edge in $G$ connecting vertices $i$ and $j$ . Equivalently, the angles can be encoded into a column phase vector $z=(\exp\iota\theta_{1},\cdots,\exp\iota\theta_{n})^{\top}$ , and measurements constitute a Hermitian matrix

$$H = A \circ \left[z z^{*} + \Delta\right], \tag{1}$$

where $A$ is the adjacency matrix of the observation graph $G$ , $\circ$ is the entrywise product, and the Hermitian matrix $\Delta\in\mathbb{C}^{n\times n}$ encodes measurement noise.

As a prototypical example of more general synchronization problems arising from many scientific fields concerning consistent pairwise comparisons within large collections of objects (e.g., cryogenic electron microscopy (Singer et al., 2011) and comparative biology (Gao et al., 2019)), phase synchronization has attracted much attention due to its simple yet rich mathematical structure. One mathematical formulation is through nonconvex optimization

$$\max_{x \in \mathbb{C}_{1}^{n}} x^{*} H x \tag{2}$$

where $\mathbb{C}_{1}^{n}$ is the Cartesian product of $n$ copies of $\mathrm{U}\!\left(1\right)$ . Depending on the context of the scientific problem, $H$ may be assumed to arise from an additive Gaussian noise model (Boumal, 2016; Bandeira et al., 2017), in which the Hermitian matrix $\Delta$ in (1) is a Wigner matrix with i.i.d. complex Gaussian entries above the diagonal, or from a random corruption model (Singer, 2011; Chen et al., 2016) that assumes

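$$H_{ij} = \begin{cases} z_{i}\bar{z}_{j} & \text{with prob. } r \in \left[0,1\right] \\ w \sim \mathrm{Unif}\left(\mathrm{U}\!\left(1\right)\right) & \text{with prob. } 1-r \end{cases} \tag{3}$$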

for each edge $\left(i,j\right)\in E$ . Note that the random corruption model can also be cast in the form (1) after proper shifting and scaling. In general, the additive Gaussian noise model is more amenable to analysis, while the random corruption model is better at capturing the behavior of physical or imaging models where many outliers exist.
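As a rough illustration of the random corruption model (3), the following sketch samples a measurement matrix on a complete graph. The function name `random_corruption_measurements`, the zero diagonal, and the choice to sample each edge once and mirror it by conjugation are assumptions made for this example, not details stated in the text.

```python
import numpy as np

def random_corruption_measurements(theta, r, rng):
    """Sample a Hermitian H under the random corruption model (complete graph).

    Each upper-triangular entry equals z_i * conj(z_j) with probability r and a
    uniformly random unit-modulus number otherwise. The diagonal is set to zero
    (an assumption of this sketch).
    """
    n = theta.size
    z = np.exp(1j * theta)
    clean = np.outer(z, z.conj())                              # z_i * conj(z_j)
    outlier = np.exp(1j * rng.uniform(0, 2 * np.pi, (n, n)))   # Unif(U(1)) draws
    keep = rng.random((n, n)) < r                              # True with probability r
    H = np.triu(np.where(keep, clean, outlier), 1)             # strictly upper triangle
    return H + H.conj().T                                      # mirror to make H Hermitian

rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, 50)
H = random_corruption_measurements(theta, r=0.6, rng=rng)
```

Every off-diagonal entry has unit modulus, so an outlier cannot be told apart from a clean measurement by magnitude alone; only consistency across edges reveals it.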

In this paper, we propose to tackle the phase synchronization problem by solving an alternative nonconvex optimization problem of “multi-frequency” nature, namely,

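$$\max_{x \in \mathbb{C}_{1}^{n}} \sum_{k=1}^{k_{\mathrm{max}}} \left(x^{k}\right)^{*} H^{\left(k\right)} x^{k} \tag{4}$$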

where $k_{\mathrm{max}}$ is the number of frequency channels, $x^{k}$ is the entrywise $k$ th power of $x$ , and $H^{\left(k\right)}\in\mathbb{C}^{n\times n}$ is a Hermitian matrix containing information of the “true signal” $z$ in the $k$ th frequency component:

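• For the random corruption model (3), we construct $H^{\left(k\right)}$ directly from $H$ by entrywise power:

$$H_{ij}^{\left(k\right)} = H_{ij}^{k}, \quad k = 1, \dots, k_{\mathrm{max}}; \ 1 \leq i, j \leq n. \tag{5}$$

• For the additive Gaussian noise model, following (Bandeira et al., 2015; Perry et al., 2018), we assume

$$H^{\left(k\right)} = A \circ \left[z^{k} \left(z^{*}\right)^{k} + \sigma_{k} \Delta^{\left(k\right)}\right], \tag{6}$$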

where each $\Delta^{\left(k\right)}$ is a complex Hermitian random matrix with independent upper diagonal entries, and the scaling $\sigma_{k}$ is chosen such that the operator norm of $\Delta^{\left(k\right)}$ is upper bounded by $\sqrt{n}$ . Unlike (Bandeira et al., 2015; Perry et al., 2018), we allow entries of $\Delta^{\left(k\right)}$ to be general sub-Gaussian random variables rather than restrictively complex Gaussian, and we do not assume independence of the $\Delta^{\left(k\right)}$ ’s across different $k$ ’s.
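The entrywise-power construction (5) needs no new measurements, as the short sketch below illustrates. The helper name `frequency_channels` and the four-angle example are hypothetical, chosen only for illustration.

```python
import numpy as np

def frequency_channels(H, k_max):
    """Return [H^(1), ..., H^(k_max)], where H^(k)_ij = (H_ij)**k entrywise."""
    # For NumPy arrays, ** is an entrywise power, not a matrix power.
    return [H ** k for k in range(1, k_max + 1)]

# Noise-free example: H = z z^*, so H^(k) should equal z^k (z^k)^*.
theta = np.array([0.3, 1.1, 2.5, 4.0])
z = np.exp(1j * theta)
H = np.outer(z, z.conj())
channels = frequency_channels(H, k_max=3)
```

Entries that are zero because an edge is missing stay zero under the entrywise power, so the sparsity pattern of the observation graph is preserved in every channel.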

We treat the two types of noise (5) and (6) in a unified model, under which we design and analyze our multi-frequency phase synchronization algorithm. We demonstrate surprising theoretical and empirical results that drastically outperform all existing phase synchronization algorithms in their corresponding settings, measured in terms of the correlation between the output and the true phase vector $z$, at a mild increase in the computational cost incurred by parallelizing the computation in $k_{\mathrm{max}}$ frequency channels. As will be demonstrated in Section 4, in the noise regime where phase synchronization is tractable, the number of frequencies $k_{\mathrm{max}}$ needed to outperform single-frequency algorithms is at most polylogarithmically dependent on the problem size $n$, while the estimation error decays polynomially in $k_{\mathrm{max}}$.

Motivation

The rationale behind the multi-frequency formulation (4) lies in the observation that statistical estimation can often benefit from higher moment estimates, even without introducing new measurements. As a motivating example, let $G$ be a complete graph, and consider the following $k_{\mathrm{max}}=2$ coupled problems:

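$$\max_{x \in \mathbb{C}_{1}^{n}} x^{*} H^{\left(1\right)} x; \qquad \max_{x \in \mathbb{C}_{1}^{n}} \left(x^{2}\right)^{*} H^{\left(2\right)} x^{2} \tag{7}$$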

where $H^{\left(1\right)}=H$ , and $H^{\left(2\right)}$ is generated according to (5). Up to rescaling by a factor of $1/r$ , $H^{\left(1\right)}$ and $H^{\left(2\right)}$ fit into model (6) with

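$$\sigma_{k} \Delta_{ij}^{\left(k\right)} = \begin{cases} \left(1-r\right) z_{i}\bar{z}_{j} & \text{with prob. } r \in \left[0,1\right] \\ e^{\iota k \varphi_{ij}} - r z_{i}\bar{z}_{j} & \text{with prob. } 1-r \end{cases} \tag{8}$$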

where $\varphi_{ij}$ are i.i.d. uniform on $\mathbb{R}/2\pi$ for $\left(i,j\right)\in E$ , and $\varphi_{ji}=-\varphi_{ij}$ . Note that $\sigma_{1}\Delta^{\left(1\right)}$ and $\sigma_{2}\Delta^{\left(2\right)}$ are by no means independent, but for all practical purposes satisfy the same sub-Gaussian bounds since $e^{\iota\varphi_{ij}}$ and $e^{\iota 2\varphi_{ij}}$ are identically distributed; we thus assume without loss of generality that $\sigma_{1}=\sigma_{2}$ . If we can find $\hat{x}\in\mathbb{C}_{1}^{n}$ satisfying jointly

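$$\left(\hat{x}^{k}\right)^{*} H^{\left(k\right)} \hat{x}^{k} \geq \left(z^{k}\right)^{*} H^{\left(k\right)} z^{k}, \quad k = 1, 2$$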

then, by Lemma 1 of (Boumal, 2016) (assuming without loss of generality that $\hat{x}^{*}z=|\hat{x}^{*}z|$ ), we have for $k=1,2$

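$$\left\|\hat{x}^{k} - z^{k}\right\|_{2}^{2} = 2\left(n - \left|z^{*}\hat{x}\right|^{k}\right) \leq 16 \sigma^{2} \left\|\Delta^{\left(k\right)}\right\|_{2}^{2} / n,$$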

which gives $\left|z^{*}\hat{x}\right|\geq\max_{k=1,2}\left\{\left(n-8\sigma^{2}\|\Delta^{\left(k\right)}\|_{2}^{2}/n\right)^{\frac{1}{k}}\right\}$ , a tighter bound than one could obtain from (9) with $k=1$ alone, especially for large $\sigma$ (with $n-8\sigma^{2}\|\Delta^{\left(k\right)}\|_{2}^{2}/n<1$ ).

The lesson we learn from this motivating example is that statistical estimation can benefit from leveraging higher-order moment information, even when the moment measurements are not essentially independent of each other. This is particularly prominent for the random corruption model, where all the “higher-order trigonometric moments” in $H^{\left(k\right)}$, $k>1$, come from the first moments in $H=H^{\left(1\right)}$ by taking entrywise powers. In drastic contrast is the message-passing algorithm in (Perry et al., 2018), for which independence of the complex Gaussian Wigner noise matrices $\Delta^{\left(k\right)}$ across the frequency channels plays an essential role. The AMP approach was motivated by the non-unique games (NUG) framework in (Bandeira et al., 2015). Our algorithm follows an efficient two-stage paradigm (initialization and iterative refinement) popularized by recent progress in nonconvex optimization (see, e.g., (Candes et al., 2015; Chen & Candes, 2015)), and combines the trigonometric moment information across frequency channels in a manner akin to classical harmonic retrieval techniques in signal processing (Stoica & Moses, 1997; Tufts & Kumaresan, 1982; Bresler & Macovski, 1986; Ziskind & Wax, 1988; Schmidt, 1986; Roy & Kailath, 1989; Sorensen & De Lathauwer, 2017a, b) and the generalized power method (Boumal, 2016). This strategy easily extends to synchronization over general compact Lie groups, as illustrated in Section 5.

Notations

Upper case letters $A,B,C,\cdots$ and lower case letters $a,b,c,\cdots$ will be used to denote matrices and vectors, respectively. $A^{*}$ , $A^{\top}$ are the transpose of $A$ with or without conjugation, respectively. The entrywise (Hadamard) product of matrix $A$ and $B$ will be denoted as $A\circ B$ . Graphs $G=\left(V,E\right)$ are always undirected and connected. Vertices of the graph will be denoted as integers $1,2,\cdots,\left|V\right|$ ; pairs of integers $\left(i,j\right)$ denote edges in $E$ . For $n\in\mathbb{N}$ we write $\left[n\right]:=\left\{1,\cdots,n\right\}$ . Norms $\left\|\cdot\right\|_{2}$ , $\left\|\cdot\right\|_{\infty}$ stand for matrix or vector norms, depending on the context; $\left\|\cdot\right\|_{\mathrm{op}}$ , $\left\|\cdot\right\|_{\textrm{F}}$ are matrix operator and Frobenius norms, respectively. The Cartesian product of $n$ copies of $\mathrm{U}\!\left(1\right)$ is denoted as $\mathbb{C}_{1}^{n}$ . The quotient space $\mathbb{R}/2\pi$ is identified with the unit circle.

2 Related Work

Phase synchronization

Directly solving (2) is NP-hard (Zhang & Huang, 2006), but many convex and nonconvex methods have been proposed to find high-quality approximate solutions. These include spectral and semi-definite programming (SDP) relaxations (Singer, 2011; Cucuringu et al., 2012; Chaudhury et al., 2015; Bandeira et al., 2016, 2017). An alternative approach using the generalized power method (GPM) has also been studied (Boumal, 2016; Liu et al., 2017; Zhong & Boumal, 2018).

Phase synchronization in multiple frequency channels

(Bandeira et al., 2015) proposed the non-unique games (NUG) SDP optimization framework for synchronization over compact Lie groups. The SDP is based on quadratically lifting the irreducible representations of the group elements, and imposing consistency among variables across frequency channels via a Fejér kernel; it is computationally expensive. (Perry et al., 2018) introduced an iterative approximate message passing (AMP) algorithm for noise model (6), assuming the noise is Gaussian and independent across frequency channels. Each iteration of the AMP performs matrix-vector multiplication and entrywise nonlinear transformation, followed by an extra Onsager correction term; it is conjectured to be asymptotically optimal.

3 Algorithm

In this section we formally state the two-stage multi-frequency phase synchronization algorithmic paradigm. Stage One combines phase synchronization outcomes from individual frequency channels with harmonic retrieval, aiming at producing a high-quality initialization; Stage Two iteratively refines an input by an extended generalized power method that works concurrently in multiple frequency channels while striving to maintain entrywise consistency.

3.1 Stage One: Initialization Strategy

Our algorithm takes as input $k_{\mathrm{max}}$ Hermitian measurement matrices $H^{\left(k\right)}$ , $k=1,\dots,k_{\mathrm{max}}$ , arising from the general sub-Gaussian model (6) (which includes (5) as a special case). This stage can be divided into three steps.

Step 1. Individual Frequency Synchronization: Apply any phase synchronization algorithm (spectral/SDP relaxation or GPM) to get phase vector estimate $u^{\left(k\right)}\in\mathbb{C}^{n}$ from each $H^{\left(k\right)}$ , $k=1,\cdots,k_{\mathrm{max}}$ , and form $W^{\left(k\right)}\coloneqq u^{\left(k\right)}(u^{\left(k\right)})^{*}$ ;

Step 2. Entrywise Harmonic Retrieval: For each $\left(i,j\right)\in E$ , use any harmonic retrieval technique to estimate $\theta_{i}-\theta_{j}$ from $W^{\left(k\right)}_{ij}$ , $k=1,2,\cdots,k_{\mathrm{max}}$ , call the estimators $\hat{\theta}_{ij}$ ;

Step 3. Final Phase Synchronization: Construct another Hermitian matrix $\widehat{H}\in\mathbb{C}^{n\times n}$ by $\widehat{H}_{ij}\coloneqq e^{\iota\hat{\theta}_{ij}}$ , and apply any phase synchronization algorithm to estimate the true phases $\{e^{\iota\theta_{1}},\cdots,e^{\iota\theta_{n}}\}$ from matrix $\widehat{H}$ .

The flexibility of the multi-frequency phase synchronization framework lies in the various choices to be made in each step. As a concrete example, we detail in Algorithm 1 a simple version that uses spectral relaxation for phase synchronization and periodogram-based harmonic retrieval. We will henceforth refer to Algorithm 1 as the periodogram peak extraction with spectral methods (PPE-SPC). If a different phase synchronization method is used, for instance, SDP relaxation, our nomenclature refers to it as PPE-SDP. We will focus on analyzing PPE-SPC in depth in Section 4, but the analysis strategy can in principle be carried over to other variants of this algorithmic paradigm.

We briefly motivate the argmax operation in Step 2 as follows. If our measurement matrices are noise-free, then the $\left(i,j\right)$th entry of $W^{\left(k\right)}$ from Step 1 should equal $e^{\iota k\left(\theta_{i}-\theta_{j}\right)}$; in this case, the goal of Step 2 is to reconstruct $\left(\theta_{i}-\theta_{j}\right)$ from its “trigonometric moments,” for which any harmonic retrieval technique can be applied; the periodogram method in Algorithm 1 is among the most naïve approaches for this purpose. For a clean signal, the periodogram $|\mathrm{Re}\{\sum_{k=1}^{k_{\mathrm{max}}}W^{\left(k\right)}_{ij}e^{-\iota k\phi}\}|$ is equal to the modulus of the Dirichlet kernel

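$$\mathrm{Dir}_{k_{\mathrm{max}}}\left(\theta_{ij} - \theta\right) = \frac{\sin\left[\left(k_{\mathrm{max}} + 1/2\right)\left(\theta_{ij} - \theta\right)\right]}{\sin\left[\left(\theta_{ij} - \theta\right)/2\right]}$$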

which attains its maximum at $\theta_{ij}=\theta_{i}-\theta_{j}\left(\mathrm{mod}2\pi\right)$ . Since the peak of $\mathrm{Dir}_{k_{\mathrm{max}}}$ becomes sharper and sharper as $k_{\mathrm{max}}$ increases, we expect the periodogram peak identification step to be robust to noise, which will produce a very high quality estimate $\widehat{H}$ for Step 3. In fact, our analysis in Section 4 suggests that this initialization stage alone can produce highly accurate phase vectors for sufficiently large $k_{\mathrm{max}}$ , and the estimation error drops inverse-polynomially in $k_{\mathrm{max}}$ .
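The three steps of Stage One can be sketched as follows, under several assumptions: a complete observation graph, a uniform grid search over $\phi$ in place of an exact argmax, and the correlation measured as $|\hat{x}^{*}z|/n$. The names `spectral_sync`, `ppe_spc`, and `grid_size` are hypothetical; this is a minimal illustration of the paradigm, not a reproduction of Algorithm 1.

```python
import numpy as np

def spectral_sync(H):
    """Leading eigenvector of a Hermitian matrix, scaled to norm sqrt(n)."""
    _, vecs = np.linalg.eigh(H)            # eigenvalues come back in ascending order
    return np.sqrt(H.shape[0]) * vecs[:, -1]

def ppe_spc(channels, grid_size=720):
    """Sketch of periodogram peak extraction with spectral steps (complete graph)."""
    n = channels[0].shape[0]
    phis = 2 * np.pi * np.arange(grid_size) / grid_size   # candidate offsets (assumed grid)
    periodogram = np.zeros((n, n, grid_size))
    for k, Hk in enumerate(channels, start=1):
        u = spectral_sync(Hk)                              # Step 1: per-frequency estimate
        W = np.outer(u, u.conj())
        periodogram += np.real(W[:, :, None] * np.exp(-1j * k * phis))
    # Step 2: entrywise peak of |Re{sum_k W_ij^(k) exp(-i k phi)}| over the grid
    theta_hat = phis[np.argmax(np.abs(periodogram), axis=2)]
    H_hat = np.triu(np.exp(1j * theta_hat), 1)
    H_hat = H_hat + H_hat.conj().T + np.eye(n)             # enforce Hermitian symmetry
    u_hat = spectral_sync(H_hat)                           # Step 3: final synchronization
    return u_hat / np.abs(u_hat)                           # project entries onto the unit circle

# Noise-free sanity check with k_max = 4
rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, 20)
z = np.exp(1j * theta)
channels = [np.outer(z**k, (z**k).conj()) for k in range(1, 5)]
x_hat = ppe_spc(channels)
corr = np.abs(np.vdot(x_hat, z)) / z.size   # assumed correlation measure
```

The grid search stores an $n \times n \times$ `grid_size` array, which is acceptable for small examples; a practical implementation would likely process edges in batches or refine the peak locally.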

3.2 Stage Two: Iterative Refinement

In this stage, we use an iterative refinement scheme that takes an initial phase vector and enhances it successively. In our implementation we warm-start this iterative algorithm with the $\hat{x}$ produced from the PPE-SPC Algorithm 1, but any initialization scheme can be applied in principle, including random initialization. This iterative refinement concurrently performs the generalized power method (GPM) (Boumal, 2016) in multiple frequency channels consistently: at each frequency $k$ , we perform power iteration by multiplication with $H^{\left(k\right)}$ ; the results are combined across frequency channels to obtain one periodogram for each vertex $i$ followed by a “soft harmonic retrieval” step that soft-thresholds (Donoho, 1995) the periodogram in frequency domain. We pick a relatively lower threshold at the beginning of this iterative scheme, but gradually raise the threshold over $0.99$ to reveal the true peak that persists. Details can be found in Algorithm 2, henceforth referred to as multi-frequency generalized power method (MFGPM).

MFGPM can be viewed as an iterative version of PPE-SPC, except that the stringent peak extraction step is replaced with the more malleable soft-thresholding. Periodograms $h_{i}^{\left(t\right)}$ are virtually the Dirichlet kernels, which truncate a Dirac delta function in the frequency domain; one can also take Cesàro means of these periodograms, or equivalently, work with the Fejér kernels that are known to converge faster to the Dirac delta function. We omit those results as no significant difference is observed in performance.
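The relation between the two kernels mentioned above can be checked numerically. The sketch below builds the Dirichlet kernel as a truncated sum of exponentials and the Fejér kernel as the Cesàro mean of Dirichlet kernels, then compares both with their standard closed forms; the function names and the test grid are choices made for this example only.

```python
import numpy as np

def dirichlet(alpha, K):
    """Dirichlet kernel: sum of exp(i k alpha) over k = -K, ..., K."""
    k = np.arange(-K, K + 1)
    return np.real(np.exp(1j * np.outer(alpha, k)).sum(axis=1))

def fejer(alpha, K):
    """Fejer kernel as the Cesaro mean of the Dirichlet kernels of orders 0..K-1."""
    return np.mean([dirichlet(alpha, m) for m in range(K)], axis=0)

alpha = np.linspace(0.1, 3.0, 50)   # avoid alpha = 0, where the closed forms are 0/0
K = 8
closed_dirichlet = np.sin((K + 0.5) * alpha) / np.sin(alpha / 2)
closed_fejer = (np.sin(K * alpha / 2) / np.sin(alpha / 2)) ** 2 / K
```

Unlike the Dirichlet kernel, the Fejér kernel is nonnegative, which is one reason it behaves better as an approximation of the Dirac delta.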

As an integral part of our two-stage algorithmic framework, MFGPM works most efficiently with initialization from PPE-SPC, but we also observed empirically that MFGPM outperforms other methods given identical random initialization, as illustrated in Figure 1, in the sense that MFGPM often produces phase vectors that correlate more strongly with the true phase vector $z$. See Section 6 for more comprehensive comparison results.

The computational complexities of PPE-SPC and MFGPM are $\mathcal{O}\left(k_{\mathrm{max}}n^{3}\right)$ and $\mathcal{O}\left(Tk_{\mathrm{max}}n^{2}\right)$ , respectively.

4 Analysis

In this section we analyze PPE-SPC in theory, under the general sub-Gaussian noise model (6). We assume the observation graph $G$ is generated from an Erdős–Rényi model with edge connectivity $p\in\left[0,1\right]$ independent of the $\Delta^{\left(k\right)}$’s.

Assumption 1.

For $\sigma>0$ and each $k\in[k_{\mathrm{max}}]$ , assume

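$$H^{\left(k\right)} = A \circ \left[z^{k} \left(z^{k}\right)^{*} + \sigma \Delta^{\left(k\right)}\right]$$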

where $z^{k}\in\mathbb{C}_{1}^{n}$ is the entrywise $k$ th power of $z$ , and $\Delta^{\left(k\right)}$ , $k=1,\dots,k_{\mathrm{max}}$ are complex random Wigner matrices satisfying the following assumptions:

(1) For any fixed $k\in[k_{\mathrm{max}}]$, $\{\mathrm{Re}(\Delta^{\left(k\right)}_{\ell j}),\mathrm{Im}(\Delta^{\left(k\right)}_{\ell j})\mid 1\leq\ell<j\leq n\}$ are jointly independent with zero mean and unit sub-Gaussian norm (Vershynin, 2018);

(2) $\Delta_{ii}^{\left(k\right)}=0$ for all $1\leq k\leq k_{\mathrm{max}}$ and $1\leq i\leq n$;

(3) $\Delta^{\left(k\right)}_{\ell j}=\overline{\Delta^{\left(k\right)}_{j\ell}}$ for all $k=1,\dots,k_{\mathrm{max}}$ and $\ell<j$.

Furthermore, assume $A$ is the adjacency matrix of an Erdős–Rényi random graph independent of all the $\Delta^{\left(k\right)}$’s, with edge connecting probability $p\in\left[0,1\right]$.

We emphasize again that Assumption 1 assumes no independence for the $\Delta^{\left(k\right)}$’s across frequency channels; only entries within the same $\Delta^{\left(k\right)}$ are assumed independent. As explained in the Introduction, this enables us to unify our discussions on the random corruption model and the additive Gaussian model in a single pass (see, e.g., (8)). Another advantage of such generality is that we can focus on analyzing complete observation graphs, since

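$$\mathbb{E} H^{\left(k\right)} = p z^{k} \left(z^{k}\right)^{*} - p I_{n}$$

where $I_{n}$ is the identity matrix of dimension $n$-by-$n$, and thus we can apply the theoretical analysis in this section to $\frac{1}{p}\left(H^{\left(k\right)}+pI_{n}\right)=z^{k}\left(z^{k}\right)^{*}+E^{\left(k\right)}$ where

$$E^{\left(k\right)} = \frac{1}{p}\left\{A \circ \left[z^{k} \left(z^{k}\right)^{*} + \sigma \Delta^{\left(k\right)}\right]\right\} - z^{k} \left(z^{k}\right)^{*} + I_{n}$$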

satisfies the same conditions as $\Delta^{\left(k\right)}$ in Assumption 1 with different absolute constants. Therefore, in the rest of this section we focus on complete observation graph $G$ only, i.e.,

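$$H^{\left(k\right)} = z^{k} \left(z^{k}\right)^{*} + \sigma \Delta^{\left(k\right)}, \quad 1 \leq k \leq k_{\mathrm{max}}.$$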

Our first goal is to understand the spectral method in PPE-SPC Step 1 and Step 3. Since Step 2 is entrywise, it is crucial to bound the $\ell_{\infty}$ distance between $z^{k}$ and the leading eigenvector $u^{\left(k\right)}$ (scaled to $\|u^{\left(k\right)}\|_{2}=\sqrt{n}$ ). The proof of the following Lemma 1 uses recent $\ell_{\infty}$ perturbation results of eigenvectors of random matrices (Eldridge et al., 2017; Abbe et al., 2017; Fan et al., 2018; Zhong & Boumal, 2018) and can be found in the supplemental material.

Lemma 1.

Assume Assumption 1 is satisfied, and the observation graph $G$ is a complete graph. Let $\epsilon\in\left(0,2\right]$ be an arbitrarily chosen but fixed absolute constant. For any $k\in[k_{\mathrm{max}}]$ , denote $u^{\left(k\right)}$ for the leading eigenvector of $H^{\left(k\right)}$ scaled such that $\left\|u^{\left(k\right)}\right\|_{2}=\sqrt{n}$ and $(z^{k})^{*}u^{\left(k\right)}=|(z^{k})^{*}u^{\left(k\right)}|$ . There exist absolute (in particular, independent of $k$ and $n$ ) constants $c_{0},C_{0},C_{2}>0$ such that, if $\sigma<c_{0}\sqrt{n/\log n}$ , there holds with probability $1-\mathcal{O}\left(n^{-\left(2+\epsilon\right)}\right)$

[TABLE]

The inequality (15) is a direct consequence of (14), which is identical to Theorem 8 of (Zhong & Boumal, 2018), but we verify in the proof that the event probability $1-\mathcal{O}(n^{-2})$ in (Zhong & Boumal, 2018) can be made slightly higher. This is necessary for taking the union bound across all $\mathcal{O}(n^{2})$ entries in the main Theorem 2.

A quick consequence of Lemma 1 is the uniform proximity of the periodogram to a Dirichlet kernel up to constant scaling and shifts, with high probability. More specifically,

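$$\left|\mathrm{Re}\left\{\sum_{k=1}^{k_{\mathrm{max}}} W^{\left(k\right)}_{ij} e^{-\iota k \phi}\right\} - \frac{1}{2}\left[\mathrm{Dir}_{k_{\mathrm{max}}}\left(\theta_{i} - \theta_{j} - \phi\right) - 1\right]\right| \leq \left|\sum_{k=1}^{k_{\mathrm{max}}} \left(W_{ij}^{\left(k\right)} - z_{i}^{k}\bar{z}_{j}^{k}\right) e^{-\iota k \phi}\right| \leq 2 C_{2} k_{\mathrm{max}} \sigma \sqrt{\log n / n}$$

with probability $1-\mathcal{O}\left(n^{-\left(2+\epsilon\right)}\right)$. Clearly, the maximum of $\left|\mathrm{Dir}_{k_{\mathrm{max}}}\left(\theta_{i}-\theta_{j}-\phi\right)-1\right|$ is attained at $\phi=\theta_{i}-\theta_{j}$. We thus expect the argmax operation in Step 2 of PPE-SPC to produce high-accuracy estimates of $\theta_{i}-\theta_{j}$ as long as the difference between the “optimization landscape” of the periodogram and the Dirichlet kernel is small enough. This is formalized in the following lemma, which exploits the geometry of the Dirichlet kernel.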

Lemma 2.

Under the same conditions as in Lemma 1, if

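$$\left[2 k_{\mathrm{max}} \sin\left(\pi / \left(2 k_{\mathrm{max}} + 1\right)\right)\right]^{-1} + 4 C_{2} \sigma \sqrt{\log n / n} < 1 \tag{16}$$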

then with probability at least $1-\mathcal{O}\left(n^{-\left(2+\epsilon\right)}\right)$

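$$\left|\hat{\theta}_{ij} - \left(\theta_{i} - \theta_{j}\right)\right| \leq 4\pi / \left(2 k_{\mathrm{max}} + 1\right).$$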

It is straightforward to check that (16) holds for sufficiently large $k_{\mathrm{max}}$ as long as $4C_{2}\sigma\sqrt{\log n/n}$ is bounded from above by $1-1/\pi$. This can be seen by noticing that the function $\left[2x\sin\left(\pi/\left(2x+1\right)\right)\right]^{-1}$ is differentiable and monotonically decreasing for all $x\geq 2$, and converges to $1/\pi<1$ as $x\to\infty$.
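The monotonicity and the limit claimed above are easy to check numerically. The sketch below evaluates the function at integer arguments only; the name `g` and the range of arguments are choices made for this example.

```python
import math

def g(x):
    """The function [2 x sin(pi / (2 x + 1))]^(-1) from the first term of (16)."""
    return 1.0 / (2 * x * math.sin(math.pi / (2 * x + 1)))

values = [g(k) for k in range(2, 200)]
gap_to_limit = values[-1] - 1 / math.pi   # remaining distance to the limit 1/pi
```

Since the values decrease toward $1/\pi$ from above, the first term of (16) is always slightly larger than $1/\pi$, which is why the noise term has to stay strictly below $1-1/\pi$.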

The most important message from Lemma 2 is the following: At the beginning of the Step 3 of PPE-SPC, the newly constructed Hermitian matrix $\widehat{H}$ is entrywise $\mathcal{O}\left(k_{\mathrm{max}}^{-1}\right)$ –close to the ground truth rank-one matrix $zz^{*}$ . We emphasize that this error incurred in $\widehat{H}$ is significantly smaller than the noise level $\sigma$ in the raw input data, and can be made arbitrarily small by choosing large $k_{\mathrm{max}}$ . We formalize this key observation in the main theorem below, for which the proof is deferred to the supplemental material.

Theorem 2.

Under the same conditions as Lemma 1 and Lemma 2, if (16) holds and $4c_{0}C_{2}<1-\sqrt{2}/\pi$ , then there exists an absolute constant $C_{3}>0$ such that, with probability $1-\mathcal{O}\left(n^{-\epsilon}\right)$ , the correlation between the true phase vector $z$ and the leading eigenvector $\hat{u}$ (scaled to $\left\|\hat{u}\right\|_{2}=\sqrt{n}$ ) of $\widehat{H}$ in PPE-SPC Step 3 is at least

$$\mathrm{Corr}\left(\hat{u}, z\right) \geq 1 - C_{3} / k_{\mathrm{max}}^{2}$$

provided that

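$$k_{\mathrm{max}} > \max\left\{5, \left(\sqrt{2}\pi\left(1 - 4 C_{2} \sigma \sqrt{\log n / n}\right) - 2\right)^{-1}\right\}.$$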

Moreover, for the phase vector $\hat{x}$ output from PPE-SPC,

$$\mathrm{Corr}\left(\hat{x}, z\right) \geq 1 - 4 C_{3} / k_{\mathrm{max}}^{2}.$$

Following the discussion after Lemma 2, it is not surprising to see in Theorem 2 that the correlation can be made arbitrarily close to $1$ (or equivalently, the $\ell_{2}$ distance between the estimated and true phase vectors can be made arbitrarily close to $0$). Moreover, it does not take an excessively large $k_{\mathrm{max}}$ for PPE-SPC to outperform all existing phase synchronization algorithms; in fact, for $\sigma\asymp\mathcal{O}(\sqrt{n/\log n})$, which is the highest level of noise tolerable to ensure the validity of Lemma 1, it suffices to take $k_{\mathrm{max}}=\mathcal{O}\left(\sqrt{n}/\sigma\right)\asymp\mathcal{O}\left(\sqrt{\log n}\right)$ to suppress the $\ell_{2}$ estimation error below the established near-optimal bound $\mathcal{O}\left(\sigma\right)$ for eigenvector-based phase synchronization methods (Bandeira et al., 2017; Zhong & Boumal, 2018). We believe (18) can still be improved by a factor of $\sqrt{n}$ by leveraging the randomness in the residue error in (15), but such a finer analysis relies on a more detailed study of the $\ell_{\infty}$ perturbation and the change in the optimization landscape, which will be pursued in future work.

5 Extension to General Synchronization

The algorithmic framework of multi-frequency phase synchronization proposed in this paper can be extended to synchronization over any compact Lie group $\mathcal{G}$, by the representation-theoretic analogue of Fourier series, namely the Peter–Weyl decomposition. In a nutshell, the Peter–Weyl theorem states that, for square integrable functions $f\in L^{2}\left(\mathcal{G}\right)$, we have the decomposition

$$f\left(g\right) = \sum_{k=0}^{\infty} d_{k} \mathrm{tr}\left(\hat{f}\left(k\right) \rho_{k}\left(g\right)\right)$$

where each $\rho_{k}:\mathcal{G}\rightarrow\mathbb{C}^{d_{k}\times d_{k}}$ is an irreducible, unitary representation of $\mathcal{G}$, and $\hat{f}\left(k\right)$ is the “Fourier coefficient”

$$\hat{f}\left(k\right) = \int_{\mathcal{G}} f\left(g\right) \rho_{k}\left(g\right) \mathrm{d}g,$$

where the integral is taken with respect to the Haar measure.

On a connected observation graph $G$ , the input data to a synchronization problem over group $\mathcal{G}$ are pairwise measurements $g_{ij}\in\mathcal{G}$ on edges $\left(i,j\right)\in E$ satisfying $g_{ij}=g_{ji}^{-1}$ . The goal is to find $n$ group elements $g_{1},\dots,g_{n}\in\mathcal{G}$ , one for each vertex, that satisfy as many constraints $g_{ij}=g_{i}g_{j}^{-1}$ as possible. Mathematically, this type of problems can often be formulated as an optimization problem (Bandeira et al., 2015)

[TABLE]

where each $f_{ij}\in L^{2}\left(\mathcal{G}\right)$ measures the compatibility between the relative alignment $g_{i}g_{j}^{-1}$ and the observation data $g_{ij}$ on edge $\left(i,j\right)\in E$ . The $f_{ij}$ ’s are nonlinear and nonconvex in general. If $f_{ij}$ are bandlimited, we can expand (21) using the Peter–Weyl decomposition

[TABLE]

which can be viewed as a generalization of the multi-frequency phase synchronization problem (4).
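To make the pairwise constraints $g_{ij}=g_{i}g_{j}^{-1}$ and $g_{ij}=g_{ji}^{-1}$ concrete, the following is a minimal sketch for $\mathcal{G}=\mathrm{SO}(3)$ with noise-free data on a complete graph. The helper names `random_rotation` and `clean_measurements` are hypothetical and introduced only for illustration; they are not part of the paper's algorithm.

```python
import numpy as np

def random_rotation(rng):
    """Draw a random rotation matrix in SO(3) via QR of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.standard_normal((3, 3)))
    q = q * np.sign(np.diag(r))      # fix column signs of the QR factor
    if np.linalg.det(q) < 0:         # flip one column to land in SO(3)
        q[:, 0] = -q[:, 0]
    return q

def clean_measurements(g):
    """Noise-free pairwise data g_ij = g_i g_j^{-1} on the complete graph."""
    n = len(g)
    return {(i, j): g[i] @ g[j].T for i in range(n) for j in range(n) if i != j}

rng = np.random.default_rng(0)
g = [random_rotation(rng) for _ in range(5)]
g_pair = clean_measurements(g)

# Symmetry constraint g_ij = g_ji^{-1} (the inverse of a rotation is its transpose).
sym_err = max(np.abs(g_pair[i, j] - g_pair[j, i].T).max() for (i, j) in g_pair)
# Cycle consistency g_ij g_jk = g_ik holds for noise-free data.
cyc_err = np.abs(g_pair[0, 1] @ g_pair[1, 2] - g_pair[0, 2]).max()
```

With noisy measurements both identities hold only approximately, which is why the problem is posed as maximizing a compatibility score rather than solving the constraints exactly.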

For simplicity of statement, we assume the observation graph $G$ is complete in this section. Since the $\rho_{k}$'s are unitary representations, each $\rho_{k}\left(g\right)$ is a unitary matrix for any $g\in\mathcal{G}$, and it is natural to solve for $g_{i}$ from its irreducible representations $\rho_{k}\left(g_{i}\right)$. Vertically stacking the $k$th irreducible representations, we organize the variables in matrices $X^{\left(k\right)}\in\mathbb{C}^{nd_{k}\times d_{k}}$, $k\in\mathbb{Z}$, defined by

[TABLE]

Analogues of the noise models also exist in this more general setting. The additive Gaussian noise model, following (Perry et al., 2018), amounts to

[TABLE]

where the parameter $\lambda_{k}>0$ stands for the signal-to-noise ratio (SNR) at “frequency $k$,” and $\Delta^{(k)}\in\mathbb{C}^{nd_{k}\times nd_{k}}$ is a Wigner matrix with i.i.d. standard complex Gaussian entries in the upper triangular part. For the random corruption model, let

[TABLE]

and set the $(i,j)$ th sub-block of $H^{(k)}$ to $\rho_{k}(g_{ij})$ .

As we elaborate in the remainder of this section, all the key ingredients in PPE-SPC and MFGPM can be extended to this more general setting. We demonstrate the efficacy of this algorithm for $\mathrm{SO}\!\left(3\right)$ synchronization in Section 6.

**Spectral relaxation:** Compute the top $d_{k}$ eigenvectors of $H^{(k)}$ and stack them horizontally to form $U^{(k)}=[u^{(k)}_{1},\dots,u^{(k)}_{d_{k}}]$. Approximate $H^{(k)}$ with $\widehat{H}^{(k)}=U^{(k)}\left(U^{(k)}\right)^{*}$.
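The spectral relaxation step can be sketched as follows, under the assumption that $H^{(k)}$ is stored as a dense Hermitian NumPy array. The function name `spectral_relaxation`, the toy $\mathrm{U}(1)$ data, and the noise level are illustrative choices, not taken from the paper.

```python
import numpy as np

def spectral_relaxation(H, d):
    """Return U (top-d eigenvectors of Hermitian H) and H_hat = U U^*."""
    eigvals, eigvecs = np.linalg.eigh(H)     # eigenvalues in ascending order
    U = eigvecs[:, -d:][:, ::-1]             # top-d eigenvectors, largest first
    return U, U @ U.conj().T

# Toy U(1) example (d_k = 1): rho_k(g_i) = exp(1j * k * theta_i).
rng = np.random.default_rng(0)
n, k = 50, 2
theta = rng.uniform(0, 2 * np.pi, n)
x = np.exp(1j * k * theta)
noise = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
noise = (noise + noise.conj().T) / 2         # Hermitian perturbation (assumed model)
H_k = np.outer(x, x.conj()) + 0.5 * noise
U, H_hat = spectral_relaxation(H_k, d=1)
corr = abs(np.vdot(U[:, 0], x)) / np.sqrt(n)   # 1.0 means perfect alignment
```

For large $n$ a sparse or iterative eigensolver would likely be preferable to a full `eigh` call; the dense version is used here only to keep the sketch short.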

**Generalized harmonic retrieval:** For each $\left(i,j\right)\in E$, set

[TABLE]

Based on these new estimates for the pairwise alignments, we build the matrix $\widehat{H}$ consisting of $n^{2}$ blocks, with $\widehat{H}_{ij}=\rho_{1}(\hat{g}_{ij})$. We then extract the top $d_{1}$ eigenvectors of $\widehat{H}$, stack them horizontally to form $\widetilde{U}=[u_{1},u_{2},\dots,u_{d_{1}}]$, and project each of its $n$ vertical blocks $\widetilde{U}_{1},\dots,\widetilde{U}_{n}\in\mathbb{C}^{d_{1}\times d_{1}}$ to a unitary matrix through singular value decomposition (SVD)

[TABLE]
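A minimal sketch of the block-wise projection, assuming (26) is the usual polar-factor projection that maps a block with SVD $M=P\Sigma Q^{*}$ to $PQ^{*}$. The function names and the random test matrix are hypothetical and serve only as an illustration.

```python
import numpy as np

def project_to_unitary(M):
    """Closest unitary matrix to M in Frobenius norm, via the SVD M = P S Q^*."""
    P, _, Qh = np.linalg.svd(M)
    return P @ Qh

def project_blocks(U_tilde, d):
    """Split an (n*d) x d matrix into n blocks of size d x d and project each."""
    n = U_tilde.shape[0] // d
    return [project_to_unitary(U_tilde[i * d:(i + 1) * d, :]) for i in range(n)]

rng = np.random.default_rng(1)
d, n = 3, 4
U_tilde = rng.standard_normal((n * d, d)) + 1j * rng.standard_normal((n * d, d))
blocks = project_blocks(U_tilde, d)
unitarity_err = max(np.abs(B.conj().T @ B - np.eye(d)).max() for B in blocks)
```

For a group such as $\mathrm{SO}(3)$ one would presumably also need to enforce real entries and determinant $+1$; that extra step is not shown here.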

**Iterative refinement:** At the $t$th iteration, denoting by $X^{(k,t)}$ the current stacked $k$th representations (22), we construct

[TABLE]

and compute the inverse Fourier transform for each of the $n$ vertical sub-blocks $Y_{1}^{(k)},\dots,Y_{n}^{(k)}\in\mathbb{C}^{d_{k}\times d_{k}}$ of $Y^{(k)}$ :

[TABLE]

Note that we only need to evaluate $C_{i}(g)$ on a finite number of uniformly sampled elements of $\mathcal{G}$, from which the “inverse Fourier transform” can be applied

[TABLE]

along with the soft-thresholding $\eta_{\tau}$ . We again project each $U_{i}^{(k)}$ to the closest unitary matrix by SVD (26), then form $X^{(k,t+1)}$ by vertically stacking the $\mathrm{Proj}(U_{i}^{\left(k\right)})$ ’s. The final outputs are $\widehat{X}^{\left(k\right)}=X^{\left(k,T\right)}$ for $k=1,\dots,k_{\mathrm{max}}$ .

6 Numerical Experiments

This section contains detailed numerical results under both additive Gaussian noise and random corruption models, for both $\mathrm{U}\!\left(1\right)$ and $\mathrm{SO}\!\left(3\right)$ . In all experiments with Gaussian noise, we keep $\sigma_{k}\equiv\sigma\equiv\sqrt{n}/\lambda$ where $\lambda>0$ is the signal-to-noise ratio (SNR); for the random corruption model (3) we set $r\equiv\lambda/\sqrt{n}$ . We fix $n=100$ and vary $\lambda$ and $k_{\mathrm{max}}$ to evaluate and compare the performance of different algorithms. When comparing iterative algorithms (AMP, GPM, MFGPM), within each random trial the random initialization is kept identical for all three algorithms and across frequency channels; between trials both data and initialization are redrawn. The remainder of the section contains results for $\mathrm{U}\!\left(1\right)$ and $\mathrm{SO}\!\left(3\right)$ synchronization with complete observation graphs only; incomplete observation graph results are similar and included in the supplemental material.

**$\mathrm{U}\!\left(1\right)$ synchronization:** In Figure 2 and Figure 3, we measure the correlation between the output and the true phase vector for various single- and multi-frequency synchronization methods, under the additive Gaussian and random corruption noise model, respectively. The SNR $\lambda$ varies between $0.7$ and $1.3$, which is in the extremely noisy regime: under the random corruption model, for instance, with $n=100$, between $87\%$ and $93\%$ of the pairwise alignments are corrupted with random elements. In each subplot, the vertical axis varies $k_{\mathrm{max}}$ from $1$ to $1024$, and the horizontal axis marks the change in $\lambda$. The bottom row in each subplot thus represents the single-frequency ($k_{\mathrm{max}}=1$) version of the algorithm. The methods under comparison are: (a) AMP (Perry et al., 2018) with random initialization; (b) PPE-SPC; (c) MFGPM with random initialization; (d) PPE-SDP (replacing the spectral methods in Algorithm 1 with SDP relaxation); (e) PPE-SDP with an additional projection to rank-one matrices in each iteration; (f) Iterating PPE-SPC three times; (g) AMP initialized with PPE-SPC; (h) MFGPM initialized with PPE-SPC.
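The corruption levels quoted above follow from $r=\lambda/\sqrt{n}$ with $n=100$. The sketch below reproduces that arithmetic and generates one instance of $\mathrm{U}(1)$ random corruption data. It assumes that each off-diagonal entry keeps the true relative phase with probability $r$ and is otherwise a uniformly random phase, that the diagonal is set to $1$, and that higher-frequency channels are entrywise powers of the single-frequency data; the function name `random_corruption_data` is hypothetical.

```python
import numpy as np

def random_corruption_data(z, r, rng):
    """H_ij = z_i conj(z_j) with prob. r, else a uniform random phase; Hermitian."""
    n = len(z)
    clean = np.outer(z, z.conj())
    junk = np.exp(1j * rng.uniform(0, 2 * np.pi, (n, n)))
    keep = rng.random((n, n)) < r
    H = np.where(keep, clean, junk)
    H = np.triu(H, 1)                       # keep the strict upper triangle
    return H + H.conj().T + np.eye(n)       # mirror it and set H_ii = 1 (assumption)

n = 100
for lam in (0.7, 1.0, 1.3):
    r = lam / np.sqrt(n)
    print(f"lambda={lam}: r={r:.2f}, corrupted fraction={1 - r:.0%}")

rng = np.random.default_rng(0)
z = np.exp(1j * rng.uniform(0, 2 * np.pi, n))
H = random_corruption_data(z, r=1.0 / np.sqrt(n), rng=rng)
# Higher-frequency channels as entrywise powers of the single-frequency data.
k_max = 4
H_multi = [H ** k for k in range(1, k_max + 1)]
```

At the endpoints $\lambda=0.7$ and $\lambda=1.3$ the loop gives corrupted fractions of $93\%$ and $87\%$, matching the range stated in the text.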

It is clear from Figure 2 and Figure 3 that leveraging information in multiple frequency channels produces results superior to those of single-frequency approaches. Most strikingly, in Figure 3 our proposed PPE-SPC method and its variants [subplots (b)–(h)] are capable of recovering the true phase vector when the SNR is well below the critical threshold $\lambda=1$ (corresponding to $r<1/\sqrt{n}$) determined in (Singer, 2011) by random matrix arguments. This is surprising because, according to (Singer, 2011), for single-frequency phase synchronization one cannot expect the correlation to be much higher than $1/\sqrt{n}$, which is $0.1$ in our experiments. This is confirmed by the bottom row of each subplot of Figure 3, but with suitably large $k_{\mathrm{max}}$ this barrier no longer exists, even though in model (5) our high-frequency measurements are generated from the single-frequency data.

In Figure 2 and Figure 3, (d) and (e) illustrate the performance of the SDP variant of PPE-SPC. The difference between (d) and (e) is the following: in (d) we directly use the $W^{(k)}$ estimated by solving the SDP in (Singer, 2011), whereas in (e) we project the SDP solution to a rank-one matrix using eigen-decomposition. The results from these SDP variants are occasionally slightly better than those of PPE-SPC, but the computational cost is high: the runtime is over $40$ times longer, and much more memory is required. The SDP relaxation in (Bandeira et al., 2015) is even more demanding on computational resources, so it is not included here.

Figures 2f and 3f explore another possibility of extending PPE-SPC: After recovering $\widehat{H}$, take entrywise powers of $\widehat{H}$ and treat them as multi-frequency data input to another fresh run of PPE-SPC. Unlike with the iterative refinement algorithm MFGPM, we observed empirically that the performance boost saturates quickly after just a couple of such repeated calls to PPE-SPC. The results in (f) in both figures are obtained by performing $3$ such repetitions. Compared with (b), this strategy improves the estimation accuracy for smaller $\lambda$, but the performance gain is not as significant as using MFGPM for iterative refinements (h).

Initialization turns out to be important for AMP: As shown in Figure 2a, when the SNR is below the critical threshold predicted in (Perry et al., 2018) ($\lambda<1$), increasing $k_{\mathrm{max}}$ does not lead to performance improvement; the critical threshold appears even higher for the random corruption model (Figure 3a). In contrast, PPE-SPC and MFGPM can always benefit from a sufficiently large $k_{\mathrm{max}}$.

**$\mathrm{SO}\!\left(3\right)$ synchronization:** Comparison results for $\mathrm{SO}\!\left(3\right)$ synchronization under the Gaussian noise model and the random corruption model are shown in Figure 4a and 4b, respectively. In all these experiments, the Fourier transform (27) is numerically evaluated using $m=1000$ elements uniformly sampled in $\mathrm{SO}\!\left(3\right)$. Clearly, the proposed method outperforms single-frequency methods and achieves higher accuracy as $k_{\mathrm{max}}$ increases; moreover, the multi-frequency formulation and algorithm lead to a drastic performance boost, especially in the “low SNR regime.”

In Figure 5 we compare AMP and MFGPM with different initialization strategies (PPE-SPC vs. random initialization) under the additive Gaussian noise model (23) with $k_{\mathrm{max}}=8$. We plot the accuracy of using PPE-SPC alone, without iterative refinement, as a baseline. The results demonstrate the performance boost from using PPE-SPC for initialization, as well as the improvement gained from applying iterative refinement on top of the PPE-SPC initialization.

7 Conclusion

In this paper, we propose a novel, multi-frequency formulation for phase synchronization as a nonconvex optimization problem, for which we develop a two-stage algorithm inspired by harmonic retrieval and the generalized power method that produces high-accuracy approximate solutions. We demonstrate in theory and experiments that the new framework significantly outperforms all existing phase synchronization algorithms.

There are many opportunities for future research. We are particularly interested in gaining a deeper theoretical understanding of the multi-frequency GPM algorithm, especially its performance guarantees and behavior near local optima. More general harmonic retrieval techniques can potentially be used in place of the periodogram-based peak extraction. We are also working on extending the algorithmic framework beyond compact Lie groups, such as Euclidean groups and symmetric groups, with applications to object matching (Shen et al., 2016; Pachauri et al., 2013).

Acknowledgements

Tingran Gao acknowledges support from an AMS-Simons Travel Grant and partial support from DARPA D15AP00109 and NSF IIS 1546413.

Appendix A Technical Proofs

A.1 Proof of Lemma 1

Proof of Lemma 1.

The conclusion of this lemma is identical to Theorem 8 of (Zhong & Boumal, 2018); the only difference is that the event probability is slightly larger, whereas in Theorem 8 of (Zhong & Boumal, 2018) the event probability is $1-\mathcal{O}\left(n^{-2}\right)$. This can be achieved by straightforwardly modifying the arguments in the proof of Theorem 8 of (Zhong & Boumal, 2018), at the expense of increasing the absolute constant picked in that proof. In fact, this is already stated by the authors of (Zhong & Boumal, 2018) on page 998 of the published version, in the paragraph right below their Theorem 5. We document here how this modification can be done.

The randomness in the proof of Theorem 8 of (Zhong & Boumal, 2018) enters only through its dependence on Lemma 9 and Lemma 10 of (Zhong & Boumal, 2018), so it is sufficient to track the failure probability of the events there. These modifications only need to be stated for real sub-Gaussian random variables, as the trivial passage from the real to the complex case is the same as detailed in the proof of Lemma 9 of (Zhong & Boumal, 2018).

Lemma 9 of (Zhong & Boumal, 2018) is based on the well-known concentration results on the maximum singular value of sub-Gaussian random matrices, in particular, Proposition 2.4 of (Rudelson & Vershynin, 2010), which states that for any $n$-by-$n$ random matrix $A$ with independent, zero mean sub-Gaussian entries (whose sub-Gaussian moments are bounded by $1$) and any $t>0$,

[TABLE]

where $c,C>0$ are positive absolute constants. We take here $t=C\sqrt{n}$, so $\left\|A\right\|_{2}\lesssim\sqrt{n}$ with probability at least $1-2e^{-cC^{2}n}$. Obviously, there exists a sufficiently large absolute constant $C_{2}>0$ such that

[TABLE]

where $\epsilon\in\left(0,2\right]$ is the arbitrarily chosen but fixed constant in the statement of our Lemma 1.

Lemma 10 of (Zhong & Boumal, 2018) attains the event probability $1-\mathcal{O}\left(n^{-2}\right)$ by taking a union bound, over $n$ instances of $1\leq m\leq n$ and $\left|\mathcal{U}_{m}\right|$ instances of $u\in\mathcal{U}_{m}$, for individual event probabilities of $1-4en^{-5}-4e^{-c_{2}n/4}$, where $c_{2}$ is an absolute positive constant. However, note that in the case of eigenvectors, $\mathcal{U}_{m}$ is a singleton, so $\left|\mathcal{U}_{m}\right|=1$ (cf. the second paragraph on p. 1000 of (Zhong & Boumal, 2018), right above the section title “Introducing auxiliary eigenvector problems”), which is smaller by a factor of order $n^{2}$ than the bound $\left|\mathcal{U}_{m}\right|\leq 3n^{2}$ stated in Lemma 10 of (Zhong & Boumal, 2018). The union bound over the $n$ values of $m$ thus yields a success probability of at least $1-4en^{-4}-4ne^{-c_{2}n/4}$, which is $1-\mathcal{O}\left(n^{-4}\right)$.
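The arithmetic behind this union bound can be spelled out as follows: with $\left|\mathcal{U}_{m}\right|=1$ there are only $n$ events, each failing with probability at most $4en^{-5}+4e^{-c_{2}n/4}$, so the total failure probability is bounded by

```latex
n\left(4e\,n^{-5} + 4e^{-c_{2}n/4}\right)
= 4e\,n^{-4} + 4n\,e^{-c_{2}n/4}
= \mathcal{O}\left(n^{-4}\right),
```

where the last step uses that $ne^{-c_{2}n/4}$ decays faster than any fixed power of $n$.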

Combining the two bounds leads to a success probability of $1-\mathcal{O}\left(n^{-\left(2+\epsilon\right)}\right)$ for any $\epsilon\in\left(0,2\right]$.

For the last inequality, note that $z=(e^{\iota\theta_{1}},\cdots,e^{\iota\theta_{n}})^{\top}$ , $e^{\iota k\left(\theta_{i}-\theta_{j}\right)}=z_{i}^{k}\overline{z_{j}^{k}}$ , and $W_{ij}^{\left(k\right)}=u^{\left(k\right)}_{i}\overline{u_{j}^{\left(k\right)}}$ , and note that $\left|z_{i}^{k}\right|=1$ for all $1\leq i\leq n$ and $1\leq k\leq k_{\mathrm{max}}$ . We have

[TABLE]

where in the last inequality we used the assumption $\sigma<c_{0}\sqrt{n/\log n}$ . ∎

A.2 Proof of Lemma 2

Proof of Lemma 2.

The proof starts with some elementary observations for the Dirichlet kernel $\mathrm{Dir}_{m}:\left[0,2\pi\right]\rightarrow\mathbb{R}$ , defined as

[TABLE]

Note the following (cf. Figure 6):

(1) $\left|\mathrm{Dir}_{m}\left(x\right)\right|$ is upper bounded by $1/\sin\left(x/2\right)$;

(2) $\left|\mathrm{Dir}_{m}\left(x\right)\right|$ vanishes at $2\pi\ell/\left(2m+1\right)$, for $\ell\in\left[2m\right]$;

(3) A unique local maximum exists between each pair of consecutive zeros on $\mathbb{R}/2\pi$.
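The first two observations can be checked numerically. The sketch below assumes the standard unnormalized form $\mathrm{Dir}_{m}(x)=\sum_{k=-m}^{m}e^{\iota kx}=\sin\left((m+1/2)x\right)/\sin\left(x/2\right)$, which is consistent with the zeros and the envelope listed above; the value $m=8$ and the grid resolution are arbitrary illustrative choices.

```python
import numpy as np

def dirichlet(m, x):
    """Dir_m(x) = sum_{k=-m}^{m} exp(i k x) = sin((m + 1/2) x) / sin(x / 2)."""
    return np.sin((m + 0.5) * x) / np.sin(x / 2)

m = 8
x = np.linspace(1e-3, 2 * np.pi - 1e-3, 20001)
vals = np.abs(dirichlet(m, x))

# (1) envelope: |Dir_m(x)| <= 1 / sin(x / 2) on (0, 2*pi)
bound_ok = np.all(vals <= 1 / np.sin(x / 2) + 1e-12)

# (2) zeros at 2*pi*l / (2m + 1) for l = 1, ..., 2m
zeros = 2 * np.pi * np.arange(1, 2 * m + 1) / (2 * m + 1)
max_at_zeros = np.abs(dirichlet(m, zeros)).max()

# Location of the highest side lobe between the first two zeros
lobe = (x > zeros[0]) & (x < zeros[1])
theta_star = x[lobe][np.argmax(vals[lobe])]
```

The quantity `theta_star` computed on the grid approximates the side-lobe maximizer between $2\pi/\left(2m+1\right)$ and $4\pi/\left(2m+1\right)$ that is used in the remainder of the proof.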

Let $\theta_{*}$ be the local maximizer attaining the highest “side lobe” of $\left|\mathrm{Dir}_{m}\left(x\right)\right|$ between $2\pi/\left(2m+1\right)$ and $4\pi/\left(2m+1\right)$ in Figure 6. When $\phi\in\left[\theta_{*},2\pi-\theta_{*}\right]$ , by Lemma 1, the periodogram $\left|\mathrm{Re}\left\{\sum_{k=1}^{k_{\mathrm{max}}}W^{\left(k\right)}_{ij}e^{-\iota k\phi}\right\}\right|$ will not exceed

[TABLE]

On the other hand, again by Lemma 1, the periodogram $\left|\mathrm{Re}\left\{\sum_{k=1}^{k_{\mathrm{max}}}W^{\left(k\right)}_{ij}e^{-\iota k\phi}\right\}\right|$ stays above

[TABLE]

Therefore, as long as the upper bound (29) is no greater than the lower bound (30), which one can check is satisfied if condition (19) in the statement of the lemma holds, i.e., if

[TABLE]

then the peak location of the periodogram $\left|\mathrm{Re}\left\{\sum_{k=1}^{k_{\mathrm{max}}}W^{\left(k\right)}_{ij}e^{-\iota k\phi}\right\}\right|$ can occur nowhere other than within $\left[0,\theta_{*}\right]\cup\left[2\pi-\theta_{*},2\pi\right]$ , which gives the conclusion

[TABLE]

with $m=k_{\mathrm{max}}$ . This completes the proof.

∎
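The periodogram peak extraction analyzed in this proof can be illustrated on synthetic data. In the sketch below, the vector `w` plays the role of $\left(W^{(k)}_{ij}\right)_{k=1}^{k_{\mathrm{max}}}$ for a single pair $(i,j)$; the noise level, the grid size, and the function name `periodogram_peak` are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def periodogram_peak(w, grid_size=4096):
    """Estimate an angle from multi-frequency samples w[k-1] ~ exp(1j * k * theta).

    Evaluates |Re sum_k w_k exp(-1j * k * phi)| on a uniform grid of phi values
    and returns the grid point where it is largest.
    """
    k = np.arange(1, len(w) + 1)
    phi = 2 * np.pi * np.arange(grid_size) / grid_size
    spectrum = np.abs(np.real(np.exp(-1j * np.outer(phi, k)) @ w))
    return phi[np.argmax(spectrum)]

rng = np.random.default_rng(0)
theta_true, k_max = 2.0, 16
k = np.arange(1, k_max + 1)
noise = 0.3 * (rng.standard_normal(k_max) + 1j * rng.standard_normal(k_max))
w = np.exp(1j * k * theta_true) + noise
theta_hat = periodogram_peak(w)
err = abs(np.angle(np.exp(1j * (theta_hat - theta_true))))   # wrapped error
```

A grid search is only one way to locate the peak; its resolution $2\pi/\texttt{grid\_size}$ limits the attainable accuracy, so in practice the grid would need to be fine relative to the main-lobe width of order $1/k_{\mathrm{max}}$.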

A.3 Proof of Theorem 2

Proof of Theorem 2.

First, we note that the second part of the theorem about $\hat{x}$ follows directly from Proposition 1 of (Liu et al., 2017), as in the proof of Lemma 8 of (Zhong & Boumal, 2018).

Assume for the moment that the key assumption in Lemma 2 is satisfied, namely, that $n$ and $k_{\mathrm{max}}$ have been chosen such that

[TABLE]

With a union bound over each of the $\mathcal{O}\left(n^{2}\right)$ estimated relative phases $\hat{\theta}_{ij}$ obtained at the end of Step 2 of Algorithm 1, with probability at least $1-\mathcal{O}\left(n^{2}\cdot n^{-\left(2+\epsilon\right)}\right)=1-\mathcal{O}\left(n^{-\epsilon}\right)$ we have for all $\left(i,j\right)\in E$

[TABLE]

and thus

[TABLE]

Therefore,

[TABLE]

where the last equality follows from bounding each entry of $\widehat{H}-zz^{*}$ individually using the rightmost term in (32). (Note that by doing so we do not need any information on the randomness of $\widehat{H}-zz^{*}$.) By the Davis–Kahan $\sin\Theta$ Theorem in Lemma 11 of (Zhong & Boumal, 2018), as long as $n>\left\|\widehat{H}-zz^{*}\right\|_{2}$, which by (33) is guaranteed if $k_{\mathrm{max}}>2\pi-1/2\approx 5.7832$, the angle $\theta\left(\hat{u},z\right)$ between $\hat{u}$ and $z$ satisfies

[TABLE]

where in the last inequality we used the fact that $\left(2-0.01\right)k_{\mathrm{max}}\geq 4\pi-1$ for all $k_{\mathrm{max}}\geq 6$ . Therefore, setting $C_{3}:=\left(400\sqrt{2}\,\pi\right)^{2}$ , we have

[TABLE]
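The numerical constants used in this step can be checked directly. The short sketch below verifies that the decimal value $5.7832$ corresponds to $2\pi-1/2$ (equivalently, $2k_{\mathrm{max}}+1>4\pi$) and that the inequality $\left(2-0.01\right)k_{\mathrm{max}}\geq 4\pi-1$ holds from $k_{\mathrm{max}}=6$ onward; the variable names are illustrative.

```python
import math

# Threshold for k_max: 2*pi - 1/2, equivalently 2*k_max + 1 > 4*pi.
threshold = 2 * math.pi - 0.5

# Smallest integer k_max above the threshold.
k_min = math.floor(threshold) + 1

# (2 - 0.01) * k_max >= 4*pi - 1 is claimed for every k_max >= 6; the left side
# grows with k_max, so checking a range of integers starting at 6 is illustrative.
holds = all((2 - 0.01) * k >= 4 * math.pi - 1 for k in range(6, 10_000))
fails_at_5 = (2 - 0.01) * 5 < 4 * math.pi - 1
```

Both conditions therefore become active at the same integer, $k_{\mathrm{max}}=6$.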

Now we seek lower bounds for $n$ and $k_{\mathrm{max}}$ under which (31) is satisfied, given the condition $\sigma<c_{0}\sqrt{n/\log n}$ imposed in Lemma 1. Obviously, (31) is satisfied if

[TABLE]

Using the elementary inequality (Kroopnick, 1997)

[TABLE]

we know that a sufficient condition for (34) to hold is

[TABLE]

which is further equivalent to

[TABLE]

Note that for all $k_{\mathrm{max}}\geq 2$ we have $2k_{\mathrm{max}}+1>\pi$ , and thus $\left(2k_{\mathrm{max}}+1\right)^{2}+\pi^{2}<2\left(2k_{\mathrm{max}}+1\right)^{2}$ . Therefore, a sufficient condition for (36) to hold is

[TABLE]

∎

Appendix B Extra Numerical Results

For the following experiments, we consider an incomplete observation graph with $n=100$ vertices drawn from the Erdős–Rényi graph model with edge connection probability $p=0.23$. Figure 7 shows that Algorithm 1 (PPE-SPC) is also robust for incomplete graphs.

Figures 8 and 9 show the performance of our PPE-SPC and its variant PPE-SPC3 on a complete graph with $n=500$ vertices.

Bibliography


Abbe, E., Fan, J., Wang, K., and Zhong, Y. Entrywise eigenvector analysis of random matrices with low expected rank. arXiv preprint arXiv:1709.09565, 2017.
Bandeira, A., Chen, Y., and Singer, A. Non-unique games over compact groups and orientation estimation in cryo-EM. arXiv preprint arXiv:1505.03840, 2015.
Bandeira, A. S., Kennedy, C., and Singer, A. Approximating the little Grothendieck problem over the orthogonal and unitary groups. Mathematical Programming, 160(1-2):433–475, 2016.
Bandeira, A. S., Boumal, N., and Singer, A. Tightness of the maximum likelihood semidefinite relaxation for angular synchronization. Mathematical Programming, 163(1-2):145–167, 2017.
Boumal, N. Nonconvex Phase Synchronization. SIAM Journal on Optimization, 26(4):2355–2377, 2016.
Bresler, Y. and Macovski, A. Exact maximum likelihood parameter estimation of superimposed exponential signals in noise. IEEE Transactions on Acoustics, Speech, and Signal Processing, 34(5):1081–1089, 1986.
Candes, E. J., Li, X., and Soltanolkotabi, M. Phase Retrieval via Wirtinger Flow: Theory and Algorithms. IEEE Transactions on Information Theory, 61(4):1985–2007, 2015.
Chaudhury, K. N., Khoo, Y., and Singer, A. Global Registration of Multiple Point Clouds Using Semidefinite Programming. SIAM Journal on Optimization, 25(1):468–501, 2015.
