Complex phase retrieval from subgaussian measurements

Felix Krahmer; Dominik Stöger

arXiv:1906.08385·cs.IT·July 21, 2020


TL;DR

This paper demonstrates that the PhaseLift method can reliably reconstruct signals from subgaussian measurements in complex phase retrieval, even without small-ball probability assumptions, extending previous real-valued results.

Contribution

It extends phase retrieval guarantees for the PhaseLift method to complex signals with subgaussian measurements, without requiring small-ball probability conditions.

Findings

01

PhaseLift successfully reconstructs signals from subgaussian measurements.

02

The proof introduces new techniques applicable to various measurement scenarios.

03

Reconstruction is possible under minimal assumptions, extending prior real-valued results.

Abstract

Phase retrieval refers to the problem of reconstructing an unknown vector $x_{0} \in \mathbb{C}^{n}$ or $x_{0} \in \mathbb{R}^{n}$ from $m$ measurements of the form $y_{i} = |\langle \xi^{(i)}, x_{0} \rangle|^{2}$, where $\{\xi^{(i)}\}_{i = 1}^{m} \subset \mathbb{C}^{n}$ are known measurement vectors. While Gaussian measurements allow for recovery of arbitrary signals provided the number of measurements scales at least linearly in the number of dimensions, it has been shown that ambiguities may arise for certain other classes of measurements $\{\xi^{(i)}\}_{i = 1}^{m}$ such as Bernoulli measurements or Fourier measurements. In this paper, we will prove that even when a subgaussian vector $\xi^{(i)} \in \mathbb{C}^{n}$ does not fulfill a small-ball probability assumption, the PhaseLift method is still able to…

Equations

$y_{i} = |\langle \xi^{(i)}, x_{0} \rangle|^{2} + w_{i}, \quad i \in [m]$

$\mathbb{P}\left(|\langle x, \xi^{(i)} \rangle| \leq \varepsilon \|x\|\right) \leq c\,\varepsilon$

$\mathbb{E}\left[|\xi^{(i)}_{j}|^{4}\right] > \mathbb{E}\left[|\xi^{(i)}_{j}|^{2}\right]$

$\frac{\|x_{0}\|_{\infty}}{\|x_{0}\|} \leq \mu < 1$

$y_{i} = \text{Tr}\left(\xi^{(i)} (\xi^{(i)})^{*} X_{0}\right) + w_{i}$

$\text{minimize } \sum_{i=1}^{m} \left|\text{Tr}\left(\xi^{(i)} (\xi^{(i)})^{*} X\right) - y_{i}\right| \quad \text{such that } X \in \mathcal{S}^{n}_{+}$

$\mathcal{A}(Z)(i) := \langle \xi^{(i)} (\xi^{(i)})^{*}, Z \rangle_{HS}$

$\text{minimize } \|\mathcal{A}(X) - y\|_{\ell_{1}} \quad \text{such that } X \in \mathcal{S}^{n}_{+}$

$\inf\left\{t > 0 : \mathbb{E}\left[\exp\left(X^{2}/t^{2}\right)\right] \leq 2\right\} \leq K < +\infty$

$\|X\|_{L_{p}} \lesssim p K$

$K \gtrsim 1$

$m \geq C_{1} n$

$\|\hat{X} - x_{0} x_{0}^{*}\|_{HS} \leq C_{3} \frac{\|w\|_{\ell_{1}}}{m}$

$\mathcal{X}_{\mu} := \left\{x_{0} \in \mathbb{C}^{n} \setminus \{0\} : \|x_{0}\|_{\infty} \leq \mu \|x_{0}\|\right\}$

$m \geq C_{1} \frac{K^{20}}{\beta^{5/2}} n$

$\|\hat{X} - x_{0} x_{0}^{*}\|_{1} \leq C_{3} \frac{K^{8}}{m \beta^{5/2}} \|w\|_{\ell_{1}}$

$x_{1}$, $\tilde{x}_{1}$

$|\langle \xi, x_{0} \rangle|^{2} = |\langle \tilde{\xi}, x_{0} \rangle|^{2} = |\langle \tilde{\xi}, \overline{x_{0}} \rangle|^{2} = |\langle \xi, \overline{x_{0}} \rangle|^{2}$

$\text{minimize } \|\mathcal{A}(X) - y\|_{\ell_{1}} \quad \text{such that } X \in \mathcal{S}^{n}_{+} \cap \mathbb{R}^{n \times n}$

$m \geq C_{1} K^{20} n$

$\|\hat{X} - x_{0} x_{0}^{*}\|_{1} \leq C_{3} \frac{K^{8}}{m} \|w\|_{\ell_{1}}$

$m \geq C_{1} \frac{K^{20}}{\beta^{5/2}} n$

$\|\hat{X} - x_{0} x_{0}^{*}\|_{1} \leq C_{3} \frac{K^{8}}{m \beta^{5/2}} \|w\|_{\ell_{1}}$

$\|\mathcal{A}(X) - y\|_{\ell_{1}} \leq \|\mathcal{A}(x_{0} x_{0}^{*}) - y\|_{\ell_{1}} = \|w\|_{\ell_{1}}$

$\|X - x_{0} x_{0}^{*}\|_{1} \leq C_{3} \frac{K^{8}}{m \beta^{5/2}} \|w\|_{\ell_{1}}$

$\|\mathcal{A}(Z) - w\|_{\ell_{1}} \leq \|w\|_{\ell_{1}}$

$\|\mathcal{A}(Z)\|_{\ell_{1}} \leq 2 \|w\|_{\ell_{1}}$

$\mathcal{M}_{\mu} := \text{cone}\left\{Z \in \mathcal{S}^{n} : \exists\, x_{0} \in \mathcal{X}_{\mu} \text{ such that } x_{0} x_{0}^{*} + Z \in \mathcal{S}^{n}_{+}\right\}$


Full text


Abstract

Phase retrieval refers to the problem of reconstructing an unknown vector $x_{0}\in\mathbb{C}^{n}$ or $x_{0}\in\mathbb{R}^{n}$ from $m$ measurements of the form $y_{i}=\big|\langle\xi^{(i)},x_{0}\rangle\big|^{2}$, where $\left\{\xi^{(i)}\right\}^{m}_{i=1}\subset\mathbb{C}^{n}$ are known measurement vectors. While Gaussian measurements allow for recovery of arbitrary signals provided the number of measurements scales at least linearly in the number of dimensions, it has been shown that ambiguities may arise for certain other classes of measurements $\left\{\xi^{(i)}\right\}^{m}_{i=1}$ such as Bernoulli measurements or Fourier measurements. In this paper, we will prove that even when a subgaussian vector $\xi^{(i)}\in\mathbb{C}^{n}$ does not fulfill a small-ball probability assumption, the PhaseLift method is still able to reconstruct a large class of signals $x_{0}\in\mathbb{R}^{n}$ from the measurements. This extends recent work by Krahmer and Liu from the real-valued to the complex-valued case. However, our proof strategy is quite different and we expect some of the new proof ideas to be useful in several other measurement scenarios as well. We then extend our results to $x_{0}\in\mathbb{C}^{n}$ up to an additional assumption which, as we show, is necessary.

1 Introduction

Phase retrieval refers to the problem of reconstructing an unknown vector $x_{0}\in\mathbb{C}^{n}$ from $m$ measurements of the form

$$y_{i}=\big|\langle\xi^{(i)},x_{0}\rangle\big|^{2}+w_{i},\quad i\in[m], \tag{1.1}$$

where the $\xi^{\left(i\right)}\in\mathbb{C}^{n}$ are known measurement vectors and $w_{i}\in\mathbb{R}$ represents additive noise. Such problems are ubiquitous in many areas of science and engineering such as X-ray crystallography [23, 32], astronomical imaging [18], ptychography [35], and quantum tomography [28].

The foundational papers [7, 13, 4] proposed to reconstruct $x_{0}$ via the PhaseLift method, a convex relaxation of the original problem. These papers have triggered many follow-up works since they were the first to establish rigorous recovery guarantees under the assumption that the measurement vectors $\xi^{\left(i\right)}$ are sampled uniformly at random from the sphere. Since then several papers have analyzed scenarios where the measurement vectors possess a significantly reduced amount of randomness, in particular spherical designs [21] and coded diffraction patterns [5, 22]. However, the theoretical results for coded diffraction patterns rely on the assumption that the modulus of the illumination patterns is varying. Indeed, it was shown in [17] that for certain illumination patterns with constant modulus ambiguities can arise, i.e., it is not possible to determine $x_{0}$ uniquely from the measurements $y_{i}$ . In fact, such ambiguities can already arise in much simpler settings, where the measurement vectors $\left(\xi^{\left(i\right)}\right)$ are i.i.d. subgaussian. For example, consider the case that $\xi^{\left(i\right)}=\left(\varepsilon^{\left(i\right)}_{1},\ldots,\varepsilon^{\left(i\right)}_{n}\right)$ , where the $\varepsilon^{\left(i\right)}_{j}$ are i.i.d. Rademacher random variables. That is, they only take the values $+1$ and $-1$ each with probability $\frac{1}{2}$ . In this case the vector $x_{0}:=e_{1}=\left(1,0,\ldots,0\right)$ can never be distinguished from the vector $\tilde{x}_{0}:=e_{2}=\left(0,1,\ldots,0\right)$ . Note that in this scenario it holds that $\mathbb{E}\left[\big{|}\xi^{\left(i\right)}_{j}\big{|}^{4}\right]=\mathbb{E}\left[\big{|}\xi^{\left(i\right)}_{j}\big{|}^{2}\right]$ and, hence, the vector $\xi^{\left(i\right)}$ does not fulfill a small-ball probability assumption, which means that there is no constant $c>0$ such that for all $\varepsilon>0$ and for all vectors $x$ it holds that

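$$\mathbb{P}\left(\big|\langle x,\xi^{(i)}\rangle\big|\leq\varepsilon\|x\|\right)\leq c\,\varepsilon. \tag{1.2}$$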

When the signals are complex even additional classes of ambiguities can arise. For example, when the measurement vectors $\xi^{\left(i\right)}$ are real, any signal $x$ and its complex-conjugate signal $\overline{x}$ will result in identical observations.
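Both ambiguities described above can be checked numerically. The following is a minimal sketch using only the Python standard library; the dimensions, the random seed, and the helper name `measurements` are illustrative assumptions and do not come from the paper.

```python
import random

def measurements(xis, x):
    """Noise-free phaseless measurements |<xi, x>|^2, one per measurement vector."""
    return [abs(sum(a.conjugate() * b for a, b in zip(xi, x))) ** 2 for xi in xis]

random.seed(0)
n, m = 6, 20  # illustrative sizes, not taken from the paper

# Real Rademacher measurement vectors: every entry is +1 or -1.
rademacher = [[random.choice((-1.0, 1.0)) for _ in range(n)] for _ in range(m)]

# Ambiguity 1: e_1 and e_2 always produce the measurement 1.
e1 = [1.0] + [0.0] * (n - 1)
e2 = [0.0, 1.0] + [0.0] * (n - 2)
y_e1 = measurements(rademacher, e1)
y_e2 = measurements(rademacher, e2)

# Ambiguity 2: with real measurement vectors, x and its complex conjugate agree.
x = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(n)]
x_conj = [v.conjugate() for v in x]
y_x = measurements(rademacher, x)
y_x_conj = measurements(rademacher, x_conj)

# Largest difference between the two measurement vectors (zero up to rounding).
conj_gap = max(abs(a - b) for a, b in zip(y_x, y_x_conj))
```

No number of additional Rademacher measurements separates either pair, which is why the results below either restrict the signal class or impose moment conditions on the entries.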

For these reasons, previous works on phase-retrieval from subgaussian measurements (see, e.g., [11]) work with real signals and require that all entries of the vector $\xi^{\left(i\right)}$ fulfill

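$$\mathbb{E}\left[\big|\xi^{(i)}_{j}\big|^{4}\right]>\mathbb{E}\left[\big|\xi^{(i)}_{j}\big|^{2}\right] \tag{1.3}$$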

for all $j\in\left[n\right]$ or make even stronger assumptions.

The only exception is [26] which shows for the real-valued case ( $x_{0}\in\mathbb{R}^{n}$ and $\xi^{\left(i\right)}\in\mathbb{R}^{n}$ ) PhaseLift recovers a large class of signals from subgaussian measurements even if estimates of the type (1.3) are not satisfied. More precisely, one obtains that all signals $x_{0}$ whose peak-to-average power ratio satisfies a mild bound of the form

$$\frac{\|x_{0}\|_{\infty}}{\|x_{0}\|}\leq\mu<1 \tag{1.4}$$

for some absolute constant $\mu>0$ , can be recovered with high probability as long as $m\gtrsim n$ . However, as the approach in [26] is intrinsically based on arguments in [16] it cannot be generalized to the complex case in a straightforward manner. This paper provides an analysis both for real-valued and complex-valued signals. We believe that this understanding will be of importance for the subsequent study of structured scenarios such as coded diffraction patterns, which are also intrinsically complex in nature.

While the proofs in previous papers [5, 21, 22, 26] relied on the construction of a so-called dual certificate, our paper will employ a more geometric approach based on Mendelson’s small ball method [25, 31]. This is motivated by recent work [28, 24, 27], which showed that a geometric analysis based on the descent cone of the trace norm can often yield additional insights compared to an approach based on dual certificates.

For the problem studied in this paper, however, the small-ball method cannot be applied directly to the entire descent cone or the entire cone of directions in which positive semidefiniteness is preserved. Rather we divide the latter cone into two parts: One that contains all the problematic cases, but is small, and one that is larger, but easier to analyze. Then we control one of these cones using a restricted isometry property and one via the small-ball method.

We think that this novel viewpoint and also some of the techniques developed in this paper will be useful for the analysis of other interesting measurement scenarios, such as the case of heavy-tailed measurement vectors $\xi^{\left(i\right)}$ or the case that $\xi^{\left(i\right)}$ has only entries $0$ and $1$ .

2 Background and main results

2.1 Notation

$\mathcal{S}^{n}$ denotes the vector space of all Hermitian matrices in $\mathbb{C}^{n\times n}$ . By $\mathcal{S}^{n}_{+}\subset\mathcal{S}^{n}$ we will denote the set of all positive semidefinite Hermitian matrices. For $A,B\in\mathcal{S}^{n}$ the Hilbert-Schmidt inner product is defined by $\langle A,B\rangle_{HS}:=\text{Tr}\,\left(A^{*}B\right)$ . The corresponding norm will be denoted by $\|\cdot\|_{HS}$ . For a matrix $Z\in\mathcal{S}^{n}$ we will denote its eigenvalues by $\lambda_{1}\left(Z\right),\lambda_{2}\left(Z\right),\ldots,\lambda_{n}\left(Z\right)$ , which are assumed to be arranged in decreasing order, i.e., $\lambda_{1}\left(Z\right)\geq\lambda_{2}\left(Z\right)\geq\ldots\geq\lambda_{n}\left(Z\right)$ . If no confusion can arise, we will suppress the dependence on $Z$ and write $\lambda_{i}$ instead of $\lambda_{i}\left(Z\right)$ . By $\|Z\|_{1}$ we will denote the Schatten- $1$ norm of $Z$ , i.e., $\|Z\|_{1}:=\sum_{i=1}^{n}|\lambda_{i}\left(Z\right)|$ . By $\text{diag}\left(Z\right)\in\mathcal{S}^{n}$ we denote the matrix that we obtain by setting all off-diagonal entries of $Z$ equal to zero. We will write $a\lesssim b$ or $b\gtrsim a$ if there is a universal constant $C>0$ such that $a\leq Cb$ .

2.2 PhaseLift

The PhaseLift method was first introduced in [7]. In this paper we focus on a variant [4, 13] based on the observation that the measurements $y_{i}$ can be rewritten in the form

$$y_{i}=\text{Tr}\,\left(\xi^{(i)}(\xi^{(i)})^{*}X_{0}\right)+w_{i}, \tag{2.1}$$

where $X_{0}=x_{0}x^{*}_{0}$ is a rank- $1$ matrix encoding the signal to be recovered up to the true inherent phase ambiguity. From this observation, PhaseLift relaxes the constraint that $X_{0}$ is of rank $1$ to obtain the optimization problem

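$$\begin{split}\text{minimize }\quad&\sum_{i=1}^{m}\big|\text{Tr}\,\left(\xi^{(i)}(\xi^{(i)})^{*}X\right)-y_{i}\big|\\ \text{such that}\quad&X\in\mathcal{S}^{n}_{+}.\end{split} \tag{2.2}$$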

In order to simplify notation we introduce the linear operator $\mathcal{A}:\mathcal{S}^{n}\rightarrow\mathbb{R}^{m}$ as

$$\mathcal{A}\left(Z\right)\left(i\right):=\langle\xi^{(i)}(\xi^{(i)})^{*},Z\rangle_{HS}. \tag{2.3}$$

Hence, setting $y:=\left(y_{1},\ldots,y_{m}\right)\in\mathbb{R}^{m}$ , (2.2) can be rewritten as

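$$\begin{split}\text{minimize }\quad&\|\mathcal{A}\left(X\right)-y\|_{\ell_{1}}\\ \text{such that}\quad&X\in\mathcal{S}^{n}_{+}.\end{split} \tag{2.4}$$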
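The lifting step rests on the identity $|\langle\xi,x_{0}\rangle|^{2}=\text{Tr}\,(\xi\xi^{*}x_{0}x_{0}^{*})$, which makes the measurements linear in the matrix $X_{0}=x_{0}x_{0}^{*}$. The sketch below checks this numerically in the noise-free case with plain Python lists. The helper names `outer`, `trace_product`, and `lifted_operator`, the convention $\langle\xi,x\rangle=\xi^{*}x$, and the Gaussian test data are assumptions made for illustration only.

```python
import random

def outer(u, v):
    """Rank-one matrix u v^* as a list of rows."""
    return [[a * b.conjugate() for b in v] for a in u]

def trace_product(A, B):
    """Tr(A B) for square matrices given as lists of rows."""
    size = len(A)
    return sum(A[i][k] * B[k][i] for i in range(size) for k in range(size))

def lifted_operator(xis, Z):
    """Entries Tr(xi xi^* Z) of the linear map A(Z); real whenever Z is Hermitian."""
    return [trace_product(outer(xi, xi), Z).real for xi in xis]

def crandn():
    """One complex standard-normal sample (illustrative test data only)."""
    return complex(random.gauss(0, 1), random.gauss(0, 1))

random.seed(1)
n, m = 4, 5
x0 = [crandn() for _ in range(n)]
xis = [[crandn() for _ in range(n)] for _ in range(m)]

# Quadratic measurements |<xi, x0>|^2 computed directly from the signal ...
quadratic = [abs(sum(a.conjugate() * b for a, b in zip(xi, x0))) ** 2 for xi in xis]
# ... and the same numbers obtained linearly from the lifted matrix X0 = x0 x0^*.
lifted = lifted_operator(xis, outer(x0, x0))

max_gap = max(abs(a - b) for a, b in zip(quadratic, lifted))
```

Solving the convex program itself requires a semidefinite or conic solver and is not attempted here.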

We note that while understanding the relaxation (2.4) is an important benchmark approach and can be solved in polynomial time, it is typically not practical for applications, as lifting increases the number of optimization variables. For this reason, a very active line of research studies recovery guarantees for algorithms that operate in the natural parameter domain such as alternating minimization (see, e.g., [34, 43]), gradient-descent based formulations (see, e.g., [6, 9, 38, 39, 10]), and anchored regression [2, 1, 3, 20]. However, most of these guarantees have been shown under the assumption that the measurement vectors $\left\{\xi^{\left(i\right)}\right\}_{i=1}^{m}$ are sampled i.i.d. from the unit sphere, so it will be a natural follow-up of this work to study to which extent our results generalize to the more practical nonconvex algorithms. In particular, most reconstruction guarantees for these non-convex approaches require an appropriate initialization. For this reason, one needs to study which initialization schemes work for the measurements considered in this paper. A natural approach will be to try spectral initializations and recent generalizations that have been shown to be feasible for a basically minimal number of measurements [29, 33, 15, 30]. We expect that the analysis provided in this paper will prove useful for this endeavour as the spectral initialization is somewhat connected to trace-norm minimization.

2.3 Subgaussian measurements

We consider random measurement vectors $\left\{\xi^{\left(i\right)}\right\}^{m}_{i=1}$ given as independent copies of a random vector $\xi$ , whose entries $\xi_{j}$ are assumed to be i.i.d. subgaussian random variables with parameter $K$ , expectation $\mathbb{E}\left[\xi_{j}\right]=0$ , and variance $\mathbb{E}\left[|\xi_{j}|^{2}\right]=1$ . Recall that a random variable $X$ is subgaussian with parameter $K$ , if and only if

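$$\inf\left\{t>0:\ \mathbb{E}\left[\exp\left(X^{2}/t^{2}\right)\right]\leq 2\right\}\leq K<+\infty. \tag{2.5}$$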

It is well known (see, e.g., [42]) that from this definition it follows for any (measurable) random variable $X$ that

$$\|X\|_{L_{p}}\lesssim pK. \tag{2.6}$$

Since $\|\xi_{1}\|_{L_{2}}^{2}=\mathbb{E}\left[|\xi_{1}|^{2}\right]=1$ inequality (2.6) immediately implies that

$$K\gtrsim 1. \tag{2.7}$$

Moreover, it is well known (see, e.g., [42]) that for all $x\in\mathbb{C}^{n}$ the random variable $\langle x,\xi\rangle$ is subgaussian with parameter $K\|x\|$ .
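As a worked instance of the subgaussian definition above, consider a Rademacher variable $\varepsilon$ (chosen here purely for illustration). Since $\varepsilon^{2}=1$ almost surely, the defining expectation can be evaluated in closed form:

```latex
\mathbb{E}\left[\exp\left(\varepsilon^{2}/t^{2}\right)\right]
  = \exp\left(1/t^{2}\right) \leq 2
\quad\Longleftrightarrow\quad
t \geq \frac{1}{\sqrt{\ln 2}} \approx 1.20 .
```

A Rademacher variable is therefore subgaussian with every parameter $K\geq 1/\sqrt{\ln 2}$ and with no smaller one, in line with the lower bound (2.7).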

2.4 Previous work

A number of previous works have studied phase retrieval with subgaussian measurements in the real-valued setting, i.e., $x_{0}\in\mathbb{R}^{n}$ and $\xi\in\mathbb{R}^{n}$ . For measurements fulfilling $\mathbb{E}\left[\big|\xi_{j}\big|^{4}\right]>\mathbb{E}\left[\big|\xi_{j}\big|^{2}\right]$ , [11] showed that PhaseLift admits order optimal uniform recovery guarantees. (In [11], the original PhaseLift approach as in [7] is analysed instead of (2.4).) Without the assumption $\mathbb{E}\left[\big|\xi_{j}\big|^{4}\right]>\mathbb{E}\left[\big|\xi_{j}\big|^{2}\right]$ , the following result was proven in [26], again for the real-valued case.

Theorem 1.

([26, Theorem V.1]) Let $\xi=\left(\xi_{1},\ldots,\xi_{n}\right)\in\mathbb{R}^{n}$ be a random vector with i.i.d. subgaussian entries. Then there exist constants $C_{1}$ , $C_{2}$ , $C_{3}$ , and $0<\mu<1$ , which depend only on the distribution of $\xi_{1}$ , such that whenever

$$m\geq C_{1}n,$$

the following statement holds with probability at least $1-\exp\left(-C_{2}n\right)$ : For all signals $x_{0}\in\mathbb{R}^{n}$ with $\|x_{0}\|_{\infty}\leq\mu\|x_{0}\|$ and all noise vectors $w\in\mathbb{R}^{m}$ any minimizer of (2.4) fulfills

$$\|\hat{X}-x_{0}x_{0}^{*}\|_{HS}\leq C_{3}\frac{\|w\|_{\ell_{1}}}{m}.$$

3 Main results

3.1 Complex signals and complex measurement vectors

In Theorem 1 both the signal $x_{0}$ and the measurement vectors $\xi^{\left(i\right)}$ are assumed to be real. While for the measurement vectors this is often too restrictive, the signal $x_{0}$ is indeed typically real-valued in applications. This important special case will be discussed in Section 3.2 below. Nevertheless, from a mathematical point of view it is still interesting to ask under which assumptions recovery is possible for complex-valued signals. Our first result deals with this case.

As we have explained in Section 1, there are subgaussian distributions for which we cannot achieve uniform recovery of all signals $x_{0}\in\mathbb{C}^{n}$ . For this reason, we define for all $0<\mu\leq 1$ the set of all signals of mildly bounded peak-to-average power ratio

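$$\mathcal{X}_{\mu}:=\left\{x_{0}\in\mathbb{C}^{n}\setminus\left\{0\right\}:\ \|x_{0}\|_{\infty}\leq\mu\|x_{0}\|\right\}. \tag{3.1}$$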

Indeed, this restriction is very mild as $\mu$ will not depend on the dimension, whereas for a Gaussian random signal the ratio $\frac{\|x\|_{\infty}}{\|x\|}$ would scale like $\sqrt{\frac{\log n}{n}}$ . Now we are prepared to state the following theorem, which is our first main result.
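To make the membership condition for $\mathcal{X}_{\mu}$ concrete, the following sketch evaluates the ratio $\|x\|_{\infty}/\|x\|$ for two extreme signals. The function names `peak_ratio` and `in_X_mu` and the dimension are hypothetical choices for illustration.

```python
import math

def peak_ratio(x):
    """Return ||x||_inf / ||x||_2 for a nonzero real or complex vector x."""
    norm = math.sqrt(sum(abs(v) ** 2 for v in x))
    return max(abs(v) for v in x) / norm

def in_X_mu(x, mu):
    """Membership test for the set X_mu of signals with bounded peak ratio."""
    return peak_ratio(x) <= mu

n = 10_000                        # illustrative dimension
flat = [1.0] * n                  # perfectly spread-out signal: ratio 1/sqrt(n)
spike = [1.0] + [0.0] * (n - 1)   # standard basis vector e_1: ratio 1

flat_ok = in_X_mu(flat, 1 / 81)
spike_ok = in_X_mu(spike, 1 / 81)
```

A flat vector of length $n$ has ratio $1/\sqrt{n}$ and so belongs to $\mathcal{X}_{1/81}$ once $n\geq 81^{2}=6561$, whereas a standard basis vector has ratio $1$ and is excluded in every dimension.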

Theorem 2.

Let the observation vector $y$ be given as in (1.1), where the random measurement vectors $\left\{\xi^{\left(i\right)}\right\}_{i=1}^{m}$ are defined as in Section 2.3. Assume that $|\mathbb{E}\left[\xi_{1}^{2}\right]|^{2}\leq 1-\beta$ for some $\beta\in\left(0,1\right)$ and that

$$m\geq C_{1}\frac{K^{20}}{\beta^{5/2}}n. \tag{3.2}$$

Then for some probability parameter $p_{\beta}=1-\mathcal{O}\left(\exp\left(\frac{-m\beta^{4}}{C_{2}K^{16}}\right)\right)$ the following two statements hold.

1. With probability at least $p_{\beta}$ one has that for all vectors $x_{0}\in\mathcal{X}_{1/81}$ and any noise vector $w\in\mathbb{R}^{m}$ any solution $\hat{X}$ of (2.4) satisfies

$$\|\hat{X}-x_{0}x_{0}^{*}\|_{1}\leq C_{3}\frac{K^{8}}{m\beta^{5/2}}\|w\|_{\ell_{1}}. \tag{3.3}$$

2. If, in addition, $\mathbb{E}\left[|\xi_{1}|^{4}\right]\geq 1+\beta$ , then with probability at least $p_{\beta}$ inequality (3.3) holds for all $x_{0}\in\mathbb{C}^{n}\setminus\left\{0\right\}$ .

Here $C_{1}$ , $C_{2}$ , and $C_{3}$ are universal constants.

The first case of Theorem 2, where one makes no assumption on the fourth moment of $\xi_{1}$ , can also be applied to certain scenarios where unique recovery of all signals is not possible. One important example is that the entries $\xi_{j}$ are drawn from $\left\{z\in\mathbb{C}:\ |z|=1\right\}$ uniformly at random. Note that these measurements will always yield the same observations $y$ for the two signals

[TABLE]

Such very sparse signals are exactly what the restriction to $\mathcal{X}_{1/81}$ in the first statement excludes, so there is no contradiction to the theorem’s conclusion that unique recovery can be achieved via (2.4) for all signals $x_{0}$ such that $\|x_{0}\|_{\infty}\leq\frac{1}{81}\|x_{0}\|$ .

Note that in the second scenario, where assumptions on the fourth moment of $\xi_{1}$ are available, we obtain a uniform recovery result over all $x_{0}\in\mathbb{C}^{n}$ . In the real-valued case a similar result has been shown in [11].
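The moment conditions in Theorem 2 are easy to evaluate for entries that are uniform on a finite set. The sketch below does this for three example distributions. The helper `moment_summary` and the three supports are illustrative assumptions, not distributions singled out by the paper beyond the Rademacher case.

```python
def moment_summary(support):
    """Moments of a random variable that is uniform on the finite list `support`.

    Returns (E[xi], E|xi|^2, E[xi^2], E|xi|^4).
    """
    k = len(support)
    mean = sum(support) / k
    second_abs = sum(abs(z) ** 2 for z in support) / k   # variance if mean is 0
    pseudo_var = sum(z * z for z in support) / k         # E[xi^2], no modulus
    fourth_abs = sum(abs(z) ** 4 for z in support) / k
    return mean, second_abs, pseudo_var, fourth_abs

# Real Rademacher entries: |E[xi^2]|^2 = 1, so the assumption of Theorem 2 fails.
rademacher = moment_summary([1, -1])

# Fourth roots of unity: E[xi^2] = 0 but E|xi|^4 = 1, so only statement 1 applies.
roots_of_unity = moment_summary([1, 1j, -1, -1j])

# Mostly-zero entries with occasional large values: E[xi^2] = 0 and E|xi|^4 = 4,
# so the additional fourth-moment condition of statement 2 holds as well.
sparse = moment_summary([2, -2, 2j, -2j] + [0] * 12)
```

All three examples have mean zero and unit variance, so they fit the normalization of Section 2.3 and differ only in the two quantities that Theorem 2 distinguishes.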

Remark 1.

An assumption of the form $|\mathbb{E}\left[\xi_{1}^{2}\right]|^{2}\leq 1-\beta$ cannot be avoided, as the following argument shows. Indeed, if $|\mathbb{E}\left[\xi_{1}^{2}\right]|^{2}=1$ , the assumption $\mathbb{E}\left[|\xi_{1}^{2}|\right]=1$ implies that $\xi=\lambda\tilde{\xi}$ almost surely, where $\lambda\in\left\{z\in\mathbb{C}:|z|=1\right\}$ is fixed and $\tilde{\xi}$ is real-valued. We observe that

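$$\big|\langle\xi,x_{0}\rangle\big|^{2}=\big|\langle\tilde{\xi},x_{0}\rangle\big|^{2}=\big|\langle\tilde{\xi},\overline{x_{0}}\rangle\big|^{2}=\big|\langle\xi,\overline{x_{0}}\rangle\big|^{2}.$$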

Consequently, $x_{0}$ and its complex-conjugate $\overline{x_{0}}$ will always lead to the same measurements.
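The argument of Remark 1 can be reproduced numerically. In the sketch below the measurement entries are a real Rademacher sign multiplied by one fixed unimodular factor, so that $|\mathbb{E}[\xi_{1}^{2}]|^{2}=1$. The value of `lam`, the sizes, and the helper `measure` are arbitrary illustrative choices.

```python
import cmath
import random

def measure(xi, x):
    """Single noise-free measurement |<xi, x>|^2 with <xi, x> = xi^* x."""
    return abs(sum(a.conjugate() * b for a, b in zip(xi, x))) ** 2

random.seed(2)
n, m = 5, 10
lam = cmath.exp(0.7j)  # fixed unimodular factor; the angle 0.7 is arbitrary

# Entries lam * (+1 or -1): complex-valued, yet real up to one global rotation.
xis = [[lam * random.choice((-1.0, 1.0)) for _ in range(n)] for _ in range(m)]

x0 = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(n)]
x0_conj = [v.conjugate() for v in x0]

y = [measure(xi, x0) for xi in xis]
y_conj = [measure(xi, x0_conj) for xi in xis]

# Largest difference between the two measurement vectors (zero up to rounding).
gap = max(abs(a - b) for a, b in zip(y, y_conj))
# x0 and its conjugate are nevertheless different signals.
signal_distance = max(abs(a - b) for a, b in zip(x0, x0_conj))
```

The two signals differ while their measurements coincide, so no reconstruction method can tell them apart from such data.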

3.2 Real signals and complex measurement vectors

We have seen in Remark 1 that the assumption $|\mathbb{E}\left[\xi_{1}^{2}\right]|^{2}\leq 1-\beta$ is necessary to distinguish between a signal $x_{0}$ and $\overline{x_{0}}$ . However, if, as in many practical applications, it is known a priori that the signal $x_{0}$ is real-valued then this ambiguity cannot arise and we can uniquely recover without additional assumptions via the following natural variant of the PhaseLift method, where we restrict the search space to real-valued matrices.

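$$\begin{split}\text{minimize }\quad&\|\mathcal{A}\left(X\right)-y\|_{\ell_{1}}\\ \text{such that}\quad&X\in\mathcal{S}^{n}_{+}\cap\mathbb{R}^{n\times n}.\end{split} \tag{3.10}$$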

The following theorem shows that in this scenario the assumption $|\mathbb{E}\left[\xi_{1}^{2}\right]|^{2}\leq 1-\beta$ is indeed not necessary.

Theorem 3.

Let the observation vector $y$ be given as in (1.1), where the random measurement vectors $\left\{\xi^{\left(i\right)}\right\}_{i=1}^{m}$ are as defined in Section 2.3. Then the following two statements hold.

1. Assume that

$$m\geq C_{1}K^{20}n.$$

Then, with probability at least $1-\mathcal{O}\left(\exp\left(\frac{-m}{C_{2}K^{16}}\right)\right)$ one has that for all vectors $x_{0}\in\mathcal{X}_{1/81}\cap\mathbb{R}^{n}$ and any noise vector $w\in\mathbb{R}^{m}$ any solution $\hat{X}$ of (3.10) satisfies

$$\|\hat{X}-x_{0}x_{0}^{*}\|_{1}\leq C_{3}\frac{K^{8}}{m}\|w\|_{\ell_{1}}.$$

2. If, in addition, it holds that $\mathbb{E}\left[|\xi_{1}|^{4}\right]\geq 1+\beta$ for some $\beta\in(0,1]$ , then, under the refined assumption

$$m\geq C_{1}\frac{K^{20}}{\beta^{5/2}}n,$$

one has a more general bound. Namely, it holds that with probability at least $1-\mathcal{O}\left(\exp\left(\frac{-m\beta^{4}}{C_{2}K^{16}}\right)\right)$ and for all vectors $x_{0}\in\mathbb{R}^{n}\backslash\left\{0\right\}$ , again for arbitrary noise vectors $w\in\mathbb{R}^{m}$ , any solution $\hat{X}$ of (3.10) satisfies

$$\|\hat{X}-x_{0}x_{0}^{*}\|_{1}\leq C_{3}\frac{K^{8}}{m\beta^{5/2}}\|w\|_{\ell_{1}}.$$

Here $C_{1}$ , $C_{2}$ , and $C_{3}$ are universal constants.

Remark 2.

In comparison to Theorem 1 the probability bound in Theorem 2 and Theorem 3 is slightly better, as it improves from $1-\exp\left(-\Omega\left(n\right)\right)$ to $1-\exp\left(-\Omega\left(m\right)\right)$ . Moreover, note that in contrast to Theorem 1 the dependence on the subgaussian distribution of $\xi$ is not hidden in the constants. Also note that in our result the dependence on $\beta$ is stated explicitly. However, we do not know whether these bounds are optimal with respect to $K$ and $\beta$ .

4 Proof of main results

4.1 Proof of Theorem 2

Our goal is to show that with high probability the matrix $x_{0}x_{0}^{*}$ is close to the minimizer $\hat{X}$ of the expression $\|\mathcal{A}(W)-y\|_{\ell_{1}}$ over all $W\in\mathcal{S}^{n}_{+}$ . A common proof strategy that we will also follow is to establish that all $X\in\mathcal{S}^{n}_{+}$ with

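$$\|\mathcal{A}\left(X\right)-y\|_{\ell_{1}}\leq\|\mathcal{A}\left(x_{0}x_{0}^{*}\right)-y\|_{\ell_{1}}=\|w\|_{\ell_{1}} \tag{4.1}$$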

are sufficiently close to the true solution in $\|\cdot\|_{1}$ -norm. More precisely, a sufficient condition for inequality (3.3) is that every $X$ fulfilling condition (4.1) satisfies

$$\|X-x_{0}x_{0}^{*}\|_{1}\leq C_{3}\frac{K^{8}}{m\beta^{5/2}}\|w\|_{\ell_{1}}. \tag{4.2}$$

Setting $Z=X-x_{0}x_{0}^{*}$ , equation (4.1) reads

$$\|\mathcal{A}\left(Z\right)-w\|_{\ell_{1}}\leq\|w\|_{\ell_{1}}. \tag{4.3}$$

By the triangle inequality this implies that

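$$\|\mathcal{A}\left(Z\right)\|_{\ell_{1}}\leq\|\mathcal{A}\left(Z\right)-w\|_{\ell_{1}}+\|w\|_{\ell_{1}}\leq 2\|w\|_{\ell_{1}}. \tag{4.4}$$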

Hence, the upper bound (4.2) that we aim to establish directly follows from an appropriate lower bound for $\|\mathcal{A}\left(Z\right)\|_{\ell_{1}}/\|Z\|_{1}$ . Here $Z\in\mathcal{S}^{n}$ ranges over those matrices for which $x_{0}x^{*}_{0}+Z$ is positive semidefinite. This set is convex, so it is locally well-approximated by a convex cone. To establish a uniform recovery result over all $x_{0}\in\mathcal{X}_{\mu}$ , we need to study the union of the corresponding cones as given by

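$$\mathcal{M}_{\mu}:=\text{cone}\left\{Z\in\mathcal{S}^{n}:\ \exists\,x_{0}\in\mathcal{X}_{\mu}\text{ such that }x_{0}x_{0}^{*}+Z\in\mathcal{S}^{n}_{+}\right\}. \tag{4.5}$$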

We will refer to this set as the cone of admissible directions.

With this notation, our proof strategy can be summarized as establishing a lower bound for

[TABLE]

which in the literature is commonly referred to as the minimum conic singular value (see, e.g., [40, 27]). Except for the precise nature of the cone under consideration, this strategy is exactly analogous to a number of works in the recent literature on linear inverse problems [8, 28]. In particular, the following lemma, which summarizes our motivating considerations above, can be seen as a variant of [8, Proposition 2.2].

Lemma 1.

Let $\mathcal{A}$ be the operator defined in (2.3). Assume that $y=\mathcal{A}\left(x_{0}x^{*}_{0}\right)+w$ . Then the minimizer $\hat{X}$ of (2.4) satisfies

[TABLE]

In the following, our goal will be to derive an appropriate lower bound for $\lambda_{\min}\left(\mathcal{A},\mathcal{M}_{\mu}\right)$ . One difficulty in the analysis is that not all matrices belonging to $\mathcal{M}_{\mu}$ are positive semidefinite. Indeed, if they were, one could use that for positive semidefinite matrices an approximate $\ell_{1}$ isometry holds (see, e.g., [7, Section 3]). While not all matrices in $\mathcal{M}_{\mu}$ are positive semidefinite, the following lemma states that each matrix belonging to $\mathcal{M}_{\mu}$ possesses at most one negative eigenvalue.

Lemma 2.

Suppose that $Z\in\mathcal{M}_{\mu}$ . Then $Z$ has at most one strictly negative eigenvalue.

Proof.

Let $Z\in\mathcal{M}_{\mu}$ . By definition of $\mathcal{M}_{\mu}$ we can find $x_{0}\in\mathcal{X}_{\mu}$ and $t>0$ such that

[TABLE]

Suppose now by contradiction that $Z$ has two (strictly) negative eigenvalues with corresponding eigenvectors $z_{1},z_{2}\in\mathbb{C}^{n}$ . Then we can find a vector $u\in\text{span}\left\{z_{1},z_{2}\right\}\backslash\left\{0\right\}$ such that $\langle u,x_{0}\rangle=0$ . This implies that for any $t>0$ we have that

[TABLE]

which is a contradiction to (4.8). ∎
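For readers who want the contradiction spelled out, here is a sketch of the computation, under the assumption that $z_{1},z_{2}$ are chosen orthonormal with eigenvalues $\lambda_{a},\lambda_{b}<0$ (symbols introduced here only for illustration). A suitable $u$ exists because the span is two-dimensional and orthogonality to $x_{0}$ is a single linear condition.

```latex
\begin{align*}
u^{*} Z u &= \lambda_{a}|a|^{2} + \lambda_{b}|b|^{2} < 0
  && \text{for } u = a z_{1} + b z_{2} \neq 0, \\
u^{*}\left(x_{0}x_{0}^{*} + tZ\right)u
  &= |\langle u, x_{0}\rangle|^{2} + t\,u^{*} Z u
   = t\,u^{*} Z u < 0
  && \text{since } \langle u, x_{0}\rangle = 0 .
\end{align*}
```

Hence $x_{0}x_{0}^{*}+tZ$ cannot be positive semidefinite for any $t>0$.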

Recall that for a matrix $Z\in\mathcal{S}^{n}$ we denoted its eigenvalues by $\left\{\lambda_{i}\left(Z\right)\right\}^{n}_{i=1}$ in decreasing order. By the previous lemma it holds that $\lambda_{i}\left(Z\right)\geq 0$ for all $i\in\left[n-1\right]$ and all $Z\in\mathcal{M}_{\mu}$ . For the proof we will partition $\mathcal{M}_{\mu}$ into two sets. Namely, for $\alpha>0$ we define

[TABLE]

The two sets can be interpreted in the following way. If we set $\alpha=1$ , it would follow that $\text{Tr}\,\left(Z\right)<0$ for all matrices $Z\in\mathcal{M}_{2,\mu,\alpha}$ . In particular, this implies that there is $x_{0}\in\mathcal{X}_{\mu}$ such that $Z$ is in the descent cone of the function $\text{Tr}\,\left(\cdot\right)$ at the point $x_{0}x^{*}_{0}$ . Hence, for $\alpha<1$ we can interpret $\mathcal{M}_{2,\mu,\alpha}$ as a slightly enlarged union of descent cones. In order to bound $\underset{Z\in\mathcal{M}_{2,\mu,\alpha}}{\inf}\|\mathcal{A}\left(Z\right)\|_{\ell_{1}}/\|Z\|_{1}$ from below we will rely on the following lemma, which is proven in Section 6.
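The partition can be illustrated on spectra alone. The sketch below assumes that $\mathcal{M}_{1,\mu,\alpha}$ consists of the admissible directions with $-\lambda_{n}(Z)\leq\alpha\sum_{i=1}^{n-1}\lambda_{i}(Z)$ and that $\mathcal{M}_{2,\mu,\alpha}$ contains the remaining ones. This reading is inferred from how the relation is used in the proof of Lemma 4 and from the trace interpretation above, and the function name `classify` is hypothetical.

```python
def classify(eigenvalues, alpha):
    """Return 1 or 2: the part of the partition a spectrum falls into.

    Assumes the spectrum has at most one negative eigenvalue (Lemma 2) and that
    part 1 is characterised by -lambda_n <= alpha * (sum of the other eigenvalues).
    """
    lam = sorted(eigenvalues, reverse=True)
    if sum(1 for v in lam if v < 0) > 1:
        raise ValueError("more than one negative eigenvalue: not admissible")
    head, smallest = sum(lam[:-1]), lam[-1]
    return 1 if -smallest <= alpha * head else 2

alpha = 4 / 5  # the value used in the lemmas of this section

psd_case = classify([1.0, 0.5, 0.2], alpha)     # no negative eigenvalue
mild_case = classify([1.0, 0.0, -0.5], alpha)   # small negative part
border_case = classify([1.0, 0.0, -0.9], alpha) # trace > 0, yet in part 2
descent_case = classify([1.0, 0.0, -2.0], 1.0)  # alpha = 1: negative trace
```

The spectrum `[1.0, 0.0, -0.9]` has positive trace but still lands in the second part for $\alpha=4/5$, which illustrates the sense in which that set is a slightly enlarged union of descent cones.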

Lemma 3.

Assume that one of the following two conditions is satisfied for $\beta\in(0,1]$ :

1. It holds that $|\mathbb{E}\left[\xi_{1}^{2}\right]|^{2}\leq 1-\beta$ . In this case we set $\mu=1/81$ .

2. In addition to $\big|\mathbb{E}\left[\xi_{1}^{2}\right]\big|^{2}\leq 1-\beta$ , the inequality $\mathbb{E}\left[|\xi_{1}|^{4}\right]\geq 1+\beta$ is fulfilled. In this case we set $\mu=1$ .

Moreover, assume that

[TABLE]

Then with probability at least $1-2\exp\left(\frac{-m\beta^{4}}{C_{2}K^{16}}\right)$ it holds that

[TABLE]

where $\alpha=4/5$ . Here $C_{1}$ , $C_{2}$ , and $C_{3}$ are universal constants.

The proof of Lemma 3 makes use of the fact that the set $\mathcal{M}_{2,\mu,\alpha}$ has low complexity in the sense that the matrices in $\mathcal{M}_{2,\mu,\alpha}$ are approximately low-rank.

In contrast, the set $\mathcal{M}_{1,\mu,\alpha}$ has rather high complexity. For example, note that $\mathcal{S}^{n}_{+}\subset\mathcal{M}_{1,\mu,\alpha}$ . Nevertheless, the quantity $\underset{Z\in\mathcal{M}_{1,\mu,\alpha}\setminus\left\{0\right\}}{\inf}\|\mathcal{A}\left(Z\right)\|_{\ell_{1}}/\|Z\|_{1}$ can be bounded from below, because the measurement matrices $\xi^{\left(i\right)}(\xi^{\left(i\right)})^{*}$ are positive semidefinite and the matrices in $\mathcal{M}_{1,\mu,\alpha}$ also have a dominant positive semidefinite component. This is achieved by the following lemma, whose proof can be found in Section 5.

Lemma 4.

Let $0<\mu\leq 1$ , $\alpha>0$ , and $\delta>0$ . Assume that

[TABLE]

Then with probability at least $1-\mathcal{O}\left(\exp\left(-\frac{m}{C_{2}K^{4}}\right)\right)$ for all $Z\in\mathcal{M}_{1,\mu,\alpha}$ it holds that

[TABLE]

Here $C_{1}$ and $C_{2}$ are absolute constants.

We remark that Lemma 4 would no longer hold if the measurement matrices $\xi^{\left(i\right)}(\xi^{\left(i\right)})^{*}$ were replaced by symmetric matrices with i.i.d. Gaussian entries (see [37, Proposition 1]).

Having gathered all the necessary ingredients we can prove the main result of this manuscript.

Proof of Theorem 2.

Set $\alpha=4/5$ . The proof of the two statements is analogous, except that for the first statement we set $\mu=1/81$ whereas for the second statement we set $\mu=1$ . By Lemma 4 and assumption (3.2) it follows that with probability at least $1-\mathcal{O}\left(\exp\left(-\frac{m}{CK^{4}}\right)\right)$

[TABLE]

Furthermore, by Lemma 3 we have with probability at least $1-2\exp\left(\frac{-m\beta^{4}}{C_{2}K^{16}}\right)$ that

[TABLE]

holds.

Set $Z:=\hat{X}-x_{0}x^{*}_{0}$ . Note that by definition we have that $Z$ is an admissible direction, i.e., $Z\in\mathcal{M}_{\mu}$ . It follows by (4.16), (4.17), and $\mathcal{M}_{\mu}=\mathcal{M}_{1,\mu,\alpha}\cup\mathcal{M}_{2,\mu,\alpha}$ that

[TABLE]

where in the last inequality we used (2.7) and $0<\beta\leq 1$ . It follows by Lemma 1 that

[TABLE]

which finishes the proof. ∎

4.2 Proof of Theorem 3

The proof of Theorem 3 is in large parts analogous to the proof of Theorem 2. For this reason, we will only highlight the main differences. Replacing $\mathcal{X}_{\mu}$ by $\mathcal{X}_{\mu}\cap\mathbb{R}^{n}$ and $\mathcal{M}_{\mu}$ by $\mathcal{M}_{\mu}\cap\mathbb{R}^{n\times n}$ we can argue analogously to Section 4.1 with the only difference that Lemma 3 has to be replaced by the following variant.

Lemma 5.

Assume that one of the following two conditions is satisfied for $\mu,\beta\in(0,1]$ :

1. It holds that $\mu=\frac{1}{81}$ and $\beta=1$ .

2. It holds that $\mathbb{E}\left[|\xi_{1}|^{4}\right]\geq 1+\beta$ and $\mu=1$ .

Moreover, assume that

[TABLE]

Then with probability at least $1-2\exp\left(\frac{-m\beta^{4}}{C_{2}K^{16}}\right)$ it holds that

[TABLE]

where $\alpha=4/5$ . Here $C_{1}$ , $C_{2}$ , and $C_{3}$ are universal constants.

In order to prove Lemma 5 we can proceed similarly as in the proof of Lemma 3, see Section 6, where in the proof of Lemma 3 we have highlighted the necessary modifications.

5 Proof of Lemma 4

Proof.

Note that for any $z\in\mathbb{C}^{n}$ we have that $\|\mathcal{A}\left(zz^{*}\right)\|_{\ell_{1}}=\sum_{i=1}^{m}|\langle\xi^{(i)},z\rangle|^{2}$ . Let $A\in\mathbb{C}^{m\times n}$ be the matrix whose rows are given by the measurement vectors $\left\{\xi^{(i)}\right\}_{i=1}^{m}$ . It follows that $\|\mathcal{A}\left(zz^{*}\right)\|_{\ell_{1}}=\|Az\|^{2}$ . It follows from [42, Theorem 4.6.1] that due to our assumption on $m$ with probability at least $1-\mathcal{O}\left(\exp\left(-\frac{m}{CK^{4}}\right)\right)$ for all $z\in\mathbb{C}^{n}$ it holds that

[TABLE]

Due to the observation above this is equivalent to

[TABLE]

for all $z\in\mathbb{C}^{n}$. We will assume in the following that (5.2) holds for all $z\in\mathbb{C}^{n}$.

Let $Z\in\mathcal{M}_{1,\mu,\alpha}$ with corresponding eigenvalue decomposition $Z=\sum_{i=1}^{n}\lambda_{i}v_{i}v^{*}_{i}$ . We observe that

[TABLE]

By Lemma 2 we know that $Z$ has at most one negative eigenvalue. If all eigenvalues $\lambda_{i}\left(Z\right)$ are nonnegative, this inequality chain and inequality (5.2) imply that

[TABLE]

which shows (4.15). Now suppose that $\lambda_{n}\left(Z\right)<0$ . By (5.2) and $-\lambda_{n}\left(Z\right)\leq\alpha\sum_{i=1}^{n-1}\lambda_{i}\left(Z\right)$ , which is due to $Z\in\mathcal{M}_{1,\mu,\alpha}$ , we obtain that

[TABLE]

Again using the relation $-\lambda_{n}\left(Z\right)\leq\alpha\sum_{i=1}^{n-1}\lambda_{i}\left(Z\right)$ we can also observe that

[TABLE]

Combining (5.9) and (5.10) shows (4.15), which finishes the proof. ∎
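The identity $\|\mathcal{A}\left(zz^{*}\right)\|_{\ell_{1}}=\|Az\|^{2}$ used at the start of this proof can be checked numerically. The following is a minimal sketch, assuming that the $i$-th entry of $\mathcal{A}(Z)$ is $\xi_{i}^{*}Z\xi_{i}$; the entry distribution (uniform on $\{1,-1,i,-i\}$), the dimensions, and all function names are hypothetical choices made only for illustration.

```python
import random

def inner(u, v):
    """Inner product <u, v> = sum_k conj(u_k) * v_k."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def measure_rank_one(xis, z):
    """Return the entries xi_i^* (z z^*) xi_i, i.e. the vector A(z z^*)."""
    n = len(z)
    # Form the rank-one matrix z z^* explicitly, then evaluate each quadratic form.
    zz = [[z[j] * z[k].conjugate() for k in range(n)] for j in range(n)]
    return [
        sum(xi[j].conjugate() * zz[j][k] * xi[k] for j in range(n) for k in range(n))
        for xi in xis
    ]

rng = random.Random(0)        # fixed seed, illustration only
m, n = 12, 4                  # hypothetical dimensions
steps = [1, -1, 1j, -1j]      # assumed entry distribution: mean 0, modulus 1
xis = [[rng.choice(steps) for _ in range(n)] for _ in range(m)]
z = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]

values = measure_rank_one(xis, z)
l1_norm = sum(abs(v) for v in values)               # ||A(z z^*)||_{l1}
az_sq = sum(abs(inner(xi, z)) ** 2 for xi in xis)   # ||Az||^2
print(l1_norm, az_sq)
```

Both printed numbers agree up to rounding, since each entry $\xi_{i}^{*}zz^{*}\xi_{i}$ equals $|\langle\xi_{i},z\rangle|^{2}$ and is therefore real and nonnegative.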

6 Proof of Lemma 3 and Lemma 5

In order to prove Lemma 3 and Lemma 5 we will use the following version of Mendelson’s small ball method [25, 31], a tool for deriving a lower bound for a nonnegative empirical process.

Lemma 6.

[14, Lemma 1] Let $\mathcal{Z}\subset\mathcal{S}^{n}$ and let $\xi^{(1)},\xi^{(2)},\ldots,\xi^{(m)}$ be i.i.d. random vectors. Let $u>0$ and $t>0$ and define

[TABLE]

Then, with probability at least $1-2\exp\left(-2t^{2}\right)$ , it holds that

[TABLE]

where $\left(\varepsilon_{i}\right)^{m}_{i=1}$ are independent, symmetric, $\left\{-1,1\right\}$-valued random variables that are independent of $\left(\xi^{(i)}\right)^{m}_{i=1}$.

Our goal is to apply Lemma 6 to $\mathcal{Z}=\mathcal{M}_{2,\mu,\alpha}\cap\left\{Z\in\mathcal{S}^{n}:\ \|Z\|_{HS}=1\right\}$. The following key lemma shows that matrices in $\mathcal{M}_{2,\mu,\alpha}$ have two favorable properties: they are approximately low-rank, and their mass with respect to the Frobenius norm is not concentrated on the diagonal if $\mu$ is small. The first property follows directly from the fact that the negative eigenvalue is rather small; the second property requires the spectral flatness of $x_{0}$, i.e., that $\mu$ is bounded.

Lemma 7.

Let $\alpha>0$ and $0<\mu\leq 1$ . Assume that $Z\in\mathcal{M}_{2,\mu,\alpha}$ . Then it holds that

1. [TABLE]

2. [TABLE]

Proof.

Let $Z\in\mathcal{M}_{2,\mu,\alpha}$ . By definition of $\mathcal{M}_{2,\mu,\alpha}$ we have that $\alpha\sum_{i=1}^{n-1}\lambda_{i}\left(Z\right)<-\lambda_{n}\left(Z\right)$ , which implies that

[TABLE]

This proves inequality (6.4).
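The chain behind this step can be sketched as follows. The sketch assumes that the eigenvalues are ordered so that $\lambda_{n}(Z)$ is the only possibly negative one (Lemma 2 guarantees at most one negative eigenvalue), and it is a reading consistent with the surrounding text, not necessarily the exact form of the display above.

```latex
\|Z\|_{1}
  = \sum_{i=1}^{n-1}\lambda_{i}(Z) - \lambda_{n}(Z)
  < \left(\alpha^{-1}+1\right)\left(-\lambda_{n}(Z)\right)
  \leq \left(1+\alpha^{-1}\right)\|Z\|_{HS}
```

The strict inequality uses $\alpha\sum_{i=1}^{n-1}\lambda_{i}(Z)<-\lambda_{n}(Z)$, and the last step uses that the modulus of a single eigenvalue is at most the Hilbert-Schmidt norm.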

In order to prove the second inequality note that by definition of $\mathcal{M}_{2,\mu,\alpha}\subset\mathcal{M}_{\mu}$ we can choose $x_{0}\in\mathcal{X}_{\mu}\cap S^{n-1}$ such that there exists $t>0$ with $x_{0}x^{*}_{0}+tZ$ positive semidefinite. For this choice of $x_{0}$ we can decompose $Z$ uniquely into

[TABLE]

where $\lambda\in\mathbb{R}$ , $\langle u,x_{0}\rangle=0$ , and $Z_{2}x_{0}=0$ . We observe that

[TABLE]

We will bound the two summands separately. We begin with $\|\text{diag}\left(Z_{1}\right)\|_{HS}$ and observe that

[TABLE]

In the first inequality we used the triangle inequality and in the third line we used that $\|x_{0}\|_{\infty}\leq\mu\|x_{0}\|=\mu$ due to $x_{0}\in\mathcal{X}_{\mu}\cap S^{n-1}$ . In the fourth line we used that $|\lambda|\leq\|Z_{1}\|_{HS}$ and $\|u\|\leq\|Z_{1}\|_{HS}$ , which follows from the fact that the summands of $Z_{1}=-\lambda x_{0}x^{*}_{0}+ux^{*}_{0}+x_{0}u^{*}$ are orthogonal to each other. In the last line we again used that $\|Z_{1}\|_{HS}\leq\|Z\|_{HS}$ as $Z$ is decomposed orthogonally into $Z=Z_{1}+Z_{2}$ .

In order to bound $\|\text{diag}\left(Z_{2}\right)\|_{HS}$ we note first that $Z_{2}$ is positive semidefinite. Indeed, suppose by contradiction that $Z_{2}$ is not positive semidefinite. Then there would exist a vector $v\in\mathbb{C}^{n}$ such that $\langle v,x_{0}\rangle=0$ and $v^{*}Z_{2}v<0$ . In particular, this would imply that $v^{*}\left(x_{0}x^{*}_{0}+tZ\right)v<0$ for all $t>0$ , which is a contradiction to our choice of $x_{0}$ .

Now let $w\in\mathbb{C}^{n}$ be the normalized (i.e., $\|w\|=1$ ) eigenvector corresponding to the eigenvalue $\lambda_{n}\left(Z\right)$ . Then we obtain that

[TABLE]

where the first inequality follows from the fact that $Z_{2}$ is positive semidefinite. Using this observation we obtain that

[TABLE]

where in the fourth line we used that $-\lambda_{n}\left(Z\right)\geq\frac{1}{1+\alpha^{-1}}\|Z\|_{1}$ , which is a consequence of the first inequality of (6.6). Combining this estimate with (6.8) and (6.9) shows part (2), which finishes the proof.

∎

In analogy to [25] we bound $Q_{\mathcal{Z}}\left(2u\right)$ using the following lemma, whose proof is based on the Paley-Zygmund inequality. A key difference is that we use the Hanson-Wright inequality to control the fourth moment $\mathbb{E}|\xi^{*}A\xi|^{4}$ appropriately.

Lemma 8.

Let $A\in\mathcal{S}^{n}$ and let $\xi=\left(\xi_{1},\ldots,\xi_{n}\right)$ be a random vector with independent and identically distributed entries $\xi_{i}$ taking values in $\mathbb{C}$ such that $\mathbb{E}\xi_{i}=0$ , $\mathbb{E}|\xi_{i}|^{2}=1$ , and $\|\xi_{i}\|_{\psi_{2}}\leq K$ . Then we have that

[TABLE]

Here $C>0$ is an absolute constant.

Proof.

Note that by the Paley-Zygmund inequality (see, e.g., [12]) we have that for all $0<t\leq\mathbb{E}|\xi^{*}A\xi|^{2}$

[TABLE]

In particular, setting $t=\mathbb{E}|\xi^{*}A\xi|^{2}/2$ yields that

[TABLE]

To estimate $\mathbb{E}|\xi^{*}A\xi|^{4}$ from above we note that the triangle inequality yields that

[TABLE]

In order to estimate the first summand we will use that $\big{|}\xi^{*}A\xi-\mathbb{E}\left[\xi^{*}A\xi\right]\big{|}$ has a mixed subgaussian/subexponential tail. We can bound the tail probability using the Hanson-Wright inequality (in the version of [36]), which states that there is a numerical constant $c>0$ such that for all $t>0$ it holds that

[TABLE]

This yields that

[TABLE]

where the third line follows from a change of variables. Combining this inequality chain with (6.15) we obtain that

[TABLE]

Inserting this into (6.14) finishes the proof. ∎
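The Paley-Zygmund step with $t=\mathbb{E}|\xi^{*}A\xi|^{2}/2$ states that $\mathbb{P}\left(|\xi^{*}A\xi|^{2}\geq\tfrac{1}{2}\mathbb{E}|\xi^{*}A\xi|^{2}\right)\geq\frac{\left(\mathbb{E}|\xi^{*}A\xi|^{2}\right)^{2}}{4\,\mathbb{E}|\xi^{*}A\xi|^{4}}$. For a small dimension and a finite entry distribution both sides can be computed exactly by enumeration. The sketch below assumes entries uniform on a finite set; the matrix `A_example`, the support, and the function names are hypothetical.

```python
import itertools

def quadratic_form(A, xi):
    """xi^* A xi for a square matrix A given as nested lists."""
    n = len(xi)
    return sum(xi[j].conjugate() * A[j][k] * xi[k] for j in range(n) for k in range(n))

def paley_zygmund_check(A, support):
    """Exact probability P(|xi^* A xi|^2 >= E/2) and the Paley-Zygmund lower bound.

    The entries of xi are i.i.d. and uniform on `support`, so every outcome
    in the product set has the same probability.
    """
    n = len(A)
    outcomes = list(itertools.product(support, repeat=n))
    sq = [abs(quadratic_form(A, xi)) ** 2 for xi in outcomes]
    second = sum(sq) / len(sq)                  # E|xi^* A xi|^2
    fourth = sum(s * s for s in sq) / len(sq)   # E|xi^* A xi|^4
    prob = sum(1 for s in sq if s >= second / 2) / len(sq)
    bound = second ** 2 / (4 * fourth)
    return prob, bound

# Hypothetical Hermitian test matrix and entry distribution (mean 0, variance 1).
A_example = [[1, 1j, 0.5],
             [-1j, -0.5, 2],
             [0.5, 2, 0]]
support = [1, -1, 1j, -1j]
prob, bound = paley_zygmund_check(A_example, support)
print(prob, bound)
```

The lower bound never exceeds $1/4$, which is why the proof has to control the fourth moment from above to keep the bound away from zero.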

In order to apply Lemma 8 we need a lower bound for $\mathbb{E}\left[|\xi^{*}A\xi|^{2}\right]$ . The next lemma computes this quantity.

Lemma 9.

Let $\xi=\left(\xi_{1},\ldots,\xi_{n}\right)$ be a random vector with independent and identically distributed entries $\xi_{i}$ taking values in $\mathbb{C}$ such that $\mathbb{E}\xi_{i}=0$ and $\mathbb{E}|\xi_{i}|^{2}=1$ . Then for all matrices $A\in\mathcal{S}^{n}$ it holds that

[TABLE]

Proof.

First, we observe that

[TABLE]

where in the third line we used that $\mathbb{E}\left[\xi_{i}\right]=0$ and that the entries of $\xi$ are independent, which implies that there are no summands where one index appears exactly three times. The first summand can be computed by

[TABLE]

where we have used that $A_{i,i}=\overline{A_{i,i}}$ for all $i\in\left[n\right]$ and $\mathbb{E}\left[|\xi_{i}|^{2}\right]=1$ . The second summand can be computed by

[TABLE]

For equation $(a)$ we used the observation that

[TABLE]

By summing up $(I)$ and $(II)$ we obtain equality (6.22).

∎
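A second-moment identity of this type can be verified by exact enumeration. The closed form in the sketch below is one way of writing $\mathbb{E}|\xi^{*}A\xi|^{2}$ for Hermitian $A$ in terms of $\mathbb{E}|\xi_{1}|^{4}$ and $\mathbb{E}[\xi_{1}^{2}]$; it was derived independently for this illustration, so the arrangement of terms may differ from (6.22). The test matrix, the supports, and the function names are hypothetical.

```python
import itertools

def exact_second_moment(A, support):
    """E|xi^* A xi|^2 for i.i.d. entries uniform on `support`, by enumeration."""
    n = len(A)
    total, count = 0.0, 0
    for xi in itertools.product(support, repeat=n):
        q = sum(xi[j].conjugate() * A[j][k] * xi[k] for j in range(n) for k in range(n))
        total += abs(q) ** 2
        count += 1
    return total / count

def closed_form(A, support):
    """Closed form, assuming A Hermitian, E xi_1 = 0 and E|xi_1|^2 = 1."""
    n = len(A)
    m4 = sum(abs(s) ** 4 for s in support) / len(support)   # E|xi_1|^4
    m2 = sum(s * s for s in support) / len(support)          # E[xi_1^2]
    diag_sq = sum(A[i][i].real ** 2 for i in range(n))
    trace = sum(A[i][i].real for i in range(n))
    re_sq = sum(A[i][j].real ** 2 for i in range(n) for j in range(n) if i != j)
    im_sq = sum(A[i][j].imag ** 2 for i in range(n) for j in range(n) if i != j)
    return ((m4 - 1) * diag_sq + trace ** 2
            + (1 + abs(m2) ** 2) * re_sq + (1 - abs(m2) ** 2) * im_sq)

A_example = [[1, 1j, 0.5],
             [-1j, -0.5, 2],
             [0.5, 2, 0]]
supports = {
    "fourth roots of unity": [1, -1, 1j, -1j],     # E[xi^2] = 0, E|xi|^4 = 1
    "Rademacher": [1, -1],                          # E[xi^2] = 1, E|xi|^4 = 1
    "sparse real": [2 ** 0.5, -(2 ** 0.5), 0, 0],   # E[xi^2] = 1, E|xi|^4 = 2
}
results = {name: (exact_second_moment(A_example, s), closed_form(A_example, s))
           for name, s in supports.items()}
for name, (exact, formula) in results.items():
    print(name, exact, formula)
```

In the Rademacher case the coefficient of the imaginary off-diagonal part vanishes, and for fourth moment equal to one the diagonal term vanishes as well, which illustrates why the proof needs either a small $\mu$ or $\mathbb{E}|\xi_{1}|^{4}\geq 1+\beta$.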

The lemmas above allow us to find a lower bound for $Q_{\mathcal{Z}}\left(2u\right)$ in Lemma 6. We still need an upper bound for the Rademacher complexity $\mathbb{E}\left[\underset{Z\in\mathcal{Z}}{\sup}\Big{|}\sum_{i=1}^{m}\varepsilon_{i}\langle\xi_{i}\xi^{*}_{i},Z\rangle_{HS}\Big{|}\right]$. The next lemma provides such a bound. In [28] a version of this lemma has already been presented. Nevertheless, we include a proof for completeness.

Lemma 10.

Assume that $m\geq C_{1}n$. Let $\alpha>0$, $0<\mu\leq 1$ and set $\mathcal{Z}:=\mathcal{M}_{2,\mu,\alpha}\cap\left\{Z\in\mathcal{S}^{n}:\ \|Z\|_{HS}=1\right\}$. Then we have that

[TABLE]

$C_{1}$ and $C_{2}$ are absolute constants.

Proof.

First, we note that by Hölder’s inequality and Lemma 7 we obtain that

[TABLE]

To bound $\mathbb{E}\left[\Big{\|}\sum^{m}_{i=1}\varepsilon_{i}\xi^{\left(i\right)}\left(\xi^{\left(i\right)}\right)^{*}\Big{\|}\right]$ let $\mathcal{N}$ be a $\frac{1}{4}$-covering of the unit sphere $S^{n-1}\subset\mathbb{R}^{n}$ with respect to the Euclidean norm such that

[TABLE]

By [41, Lemma 5.4] we have that

[TABLE]

Fix $x\in\mathcal{N}$ and observe that

[TABLE]

where we have set $z_{i}:=\varepsilon_{i}|\langle\xi_{i},x\rangle|^{2}$ . We observe that $\mathbb{E}\left[z_{i}\right]=0$ and, moreover,

[TABLE]

where the first equality follows directly from the definition of the $\|\cdot\|_{\psi_{1}}$-norm (for the definition of the $\|\cdot\|_{\psi_{1}}$-norm see, e.g., [41, Section 5.2.4]). The first inequality can be seen using [42, Lemma 2.7.6] and the second one using [42, Lemma 3.4.2]. By the Bernstein inequality (see, e.g., [42, Theorem 2.8.1]) we obtain that

[TABLE]

where $c>0$ is some numerical constant. It follows from (6.40), (6.41), (6.43), and a union bound that

[TABLE]

where $\tilde{c}=\log 12$ . Then, whenever $m\geq\frac{\tilde{c}}{c}n$ , we obtain that

[TABLE]

In order to finish we need to estimate the two integrals. By a change of variables and [19, Lemma C.7] we obtain that

[TABLE]

Inserting this in the inequality chain above yields that

[TABLE]

Combined with inequality (6.39) this finishes the proof. ∎
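The reduction used for a fixed net point $x$, namely $x^{*}\left(\sum_{i=1}^{m}\varepsilon_{i}\xi_{i}\xi_{i}^{*}\right)x=\sum_{i=1}^{m}z_{i}$ with $z_{i}=\varepsilon_{i}|\langle\xi_{i},x\rangle|^{2}$, can be illustrated numerically. This is a minimal sketch under assumed inputs: the entry distribution, the dimensions, and the helper names are hypothetical, and the sketch checks only the algebraic identity, not the concentration bound.

```python
import random

def signed_sum_matrix(eps, xis):
    """M = sum_i eps_i * xi_i xi_i^*, returned as nested lists."""
    n = len(xis[0])
    return [[sum(e * xi[j] * xi[k].conjugate() for e, xi in zip(eps, xis))
             for k in range(n)] for j in range(n)]

def quadratic_form(M, x):
    """x^* M x."""
    n = len(x)
    return sum(x[j].conjugate() * M[j][k] * x[k] for j in range(n) for k in range(n))

rng = random.Random(1)        # fixed seed, illustration only
m, n = 10, 3                  # hypothetical dimensions
steps = [1, -1, 1j, -1j]      # assumed entry distribution: mean 0, modulus 1
xis = [[rng.choice(steps) for _ in range(n)] for _ in range(m)]
eps = [rng.choice([-1, 1]) for _ in range(m)]   # Rademacher signs
x = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]

M = signed_sum_matrix(eps, xis)
quad = quadratic_form(M, x)
# z_i = eps_i * |<xi_i, x>|^2 with <u, v> = sum_k conj(u_k) v_k
z_terms = [e * abs(sum(a.conjugate() * b for a, b in zip(xi, x))) ** 2
           for e, xi in zip(eps, xis)]
print(quad, sum(z_terms))
```

Because the $z_{i}$ are independent, centered, and subexponential, the Bernstein inequality applies to their sum for each fixed $x$, and the union bound over the net then gives the bound on the operator norm.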

Now we have gathered all the ingredients to complete the proof.

Proof of Lemma 3 and Lemma 5.

We will start by showing that $\mathbb{E}\left[|\langle\xi\xi^{*},Z\rangle_{HS}|^{2}\right]\gtrsim\beta\|Z\|^{2}_{HS}$ for all $Z\in\mathcal{M}_{2,\mu,\alpha}$ in the case of Lemma 3, or all $Z\in\mathcal{M}_{2,\mu,\alpha}\cap\mathbb{R}^{n\times n}$ in the case of Lemma 5, respectively.

We first consider the second case and assume that the condition $\mathbb{E}\left[|\xi_{i}|^{4}\right]\geq 1+\beta$ is satisfied for some $\beta>0$ . By Lemma 9 we obtain that for all $Z\in\mathcal{M}_{2,\mu,\alpha}$ under the conditions of Lemma 3

[TABLE]

Under the assumptions of Lemma 5 we observe that $\sum_{i\neq j}\text{Im}\left(Z_{i,j}\right)^{2}=0$ and $\sum_{i\neq j}\text{Re}\left(Z_{i,j}\right)^{2}=\|Z-\text{diag}\left(Z\right)\|^{2}_{HS}$ . Hence, a similar argument as before also leads to

[TABLE]

Under the first assumption we obtain by Lemma 9 that for all $Z\in\mathcal{M}_{2,\mu,\alpha}$

[TABLE]

Similarly, under the assumptions of Lemma 5 we can again use that $\sum_{i\neq j}\text{Im}\left(Z_{i,j}\right)^{2}=0$ and $\sum_{i\neq j}\text{Re}\left(Z_{i,j}\right)^{2}=\|Z-\text{diag}\left(Z\right)\|^{2}_{HS}$ to obtain by an analogous argument that

[TABLE]

The remainder of the proof will be the same for Lemma 3 and Lemma 5. By Lemma 7 we have that

[TABLE]

By the triangle inequality it follows that

[TABLE]

for $\tilde{C}=100$ . Inserting into (6.57) one obtains that

[TABLE]

Hence, we have shown in all cases that $\mathbb{E}\left[|\langle\xi\xi^{*},Z\rangle_{HS}|^{2}\right]\geq\frac{\beta}{\tilde{C}^{2}}\|Z\|^{2}_{HS}$ .

By Lemma 8 it follows that for all $Z\in\mathcal{M}_{2,\mu,\alpha}$

[TABLE]

Note that for all $Z\in\mathcal{M}_{2,\mu,\alpha}$

[TABLE]

where in the last inequality we also used that $\alpha=\frac{4}{5}$ . This shows that for all $Z\in\mathcal{M}_{2,\mu,\alpha}$ it holds that

[TABLE]

where we used that $K\gtrsim 1$ due to (2.7). Now recall that $\mathcal{Z}:=\mathcal{M}_{2,\mu,\alpha}\cap\left\{Z\in\mathcal{S}^{n}:\ \|Z\|_{HS}=1\right\}$ . Thus we have shown that

[TABLE]

where $Q_{\mathcal{Z}}\left(\cdot\right)$ is defined in (6.1), we have set $u=\frac{\sqrt{\beta}}{2\sqrt{2}\tilde{C}}$, and $C^{\prime\prime}>0$ is a constant chosen large enough. From Lemma 10 it follows that

[TABLE]

Combining this inequality with our choice of $u$ and choosing the constant in assumption (4.12) large enough it follows that

[TABLE]

Applying Lemma 6 yields that with probability at least $1-2\exp\left(-2t^{2}\right)$

[TABLE]

Setting $t=\frac{\sqrt{m}\beta^{2}}{4C^{\prime\prime}K^{8}}$ it follows that with probability at least $1-2\exp\left(\frac{-m\beta^{4}}{8(C^{\prime\prime})^{2}K^{16}}\right)$ it holds that

[TABLE]

Hence, by the definition of $\mathcal{A}$ and $\mathcal{Z}$ it follows that

[TABLE]

Due to $\alpha=\frac{4}{5}$ and Lemma 7 we have that $\|Z\|_{1}\leq\frac{9}{4}\|Z\|_{HS}$ for all $Z\in\mathcal{M}_{2,\mu,\alpha}$ . Combined with (6.75) this shows (4.13). ∎
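The constants in the last steps can be checked by direct arithmetic. The lines below only restate quantities fixed above; the second identity assumes that part (1) of Lemma 7 has the form $\|Z\|_{1}\leq\left(1+\alpha^{-1}\right)\|Z\|_{HS}$, which is consistent with the value $\frac{9}{4}$ quoted here.

```latex
2t^{2}
  = 2\cdot\frac{m\beta^{4}}{16\,(C^{\prime\prime})^{2}K^{16}}
  = \frac{m\beta^{4}}{8\,(C^{\prime\prime})^{2}K^{16}}
  \quad\text{for } t=\frac{\sqrt{m}\,\beta^{2}}{4C^{\prime\prime}K^{8}},
\qquad
1+\alpha^{-1} = 1+\frac{5}{4} = \frac{9}{4}
  \quad\text{for } \alpha=\frac{4}{5}.
```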

Acknowledgements

This work has been supported by the German Science Foundation (DFG) in the context of the joint project Bilinear Compressed Sensing (KR 4512/2-1) as part of the Priority Program 1798 as well as the Emmy Noether Junior Research Group Randomized Sensing of Signals and Images (KR 4512/1-1). Furthermore, the authors want to thank Peter Jung for inspiring discussions.

Bibliography

[1] S. Bahmani and J. Romberg. A flexible convex relaxation for phase retrieval. Electron. J. Stat., 11(2):5254–5281, 2017.
[2] S. Bahmani and J. Romberg. Phase retrieval meets statistical learning theory: A flexible convex relaxation. In Artificial Intelligence and Statistics, pages 252–260, 2017.
[3] S. Bahmani and J. Romberg. Anchored regression: Solving random convex equations via convex programming. Found. Comput. Math., to appear.
[4] E. J. Candès and X. Li. Solving quadratic equations via PhaseLift when there are about as many equations as unknowns. Found. Comput. Math., 14(5):1017–1026, 2014.
[5] E. J. Candès, X. Li, and M. Soltanolkotabi. Phase retrieval from coded diffraction patterns. Appl. Comput. Harmon. Anal., 39(2):277–299, 2015.
[6] E. J. Candès, X. Li, and M. Soltanolkotabi. Phase retrieval via Wirtinger flow: theory and algorithms. IEEE Trans. Inf. Theory, 61(4):1985–2007, 2015.
[7] E. J. Candès, T. Strohmer, and V. Voroninski. PhaseLift: Exact and stable signal recovery from magnitude measurements via convex programming. Comm. Pure Appl. Math., 66(8):1241–1274, 2013.
[8] V. Chandrasekaran, B. Recht, P. A. Parrilo, and A. S. Willsky. The convex geometry of linear inverse problems. Found. Comput. Math., 12(6):805–849, 2012.
