Fourier Phase Retrieval with Extended Support Estimation via Deep Neural Network
Kyung-Su Kim, Sae-Young Chung

TL;DR
This paper introduces a deep neural network approach for sparse Fourier phase retrieval that estimates an extended support set to improve signal reconstruction accuracy with low computational complexity.
Contribution
The paper proposes a novel DNN-based method to estimate an extended support set for sparse phase retrieval, enhancing accuracy and efficiency over existing methods.
Findings
Outperforms local search-based greedy methods in accuracy.
Achieves lower computational complexity.
Demonstrates superior performance in numerical experiments.
Fourier Phase Retrieval with Extended Support Estimation via Deep Neural Network
Kyung-Su Kim, Sae-Young Chung
School of Electrical Engineering, Korea Advanced Institute of Science and Technology
E-mails: kyungsukim, [email protected]
Abstract
We consider the problem of sparse phase retrieval from Fourier transform magnitudes to recover a $k$-sparse signal vector and its support. We exploit an extended support estimate, which has size larger than $k$, contains the support, and is obtained by a trained deep neural network (DNN). To make the DNN learnable, it provides the extended support estimate as the union of equivalent solutions of the support by utilizing modulo Fourier invariances. The extended support can be estimated with short running time via the DNN, and the support can then be determined from the DNN output rather than from the full index set by applying hard thresholding to a signal estimate supported on the extended support estimate. Thus, the DNN-based extended support estimation improves the reconstruction performance of the signal with a low complexity burden dependent on $k$. Numerical results verify that the proposed scheme has a superior performance with lower complexity compared to local search-based greedy sparse phase retrieval and a state-of-the-art variant of the Fienup method.
Index Terms:
Deep neural network, extended support estimation, Fourier transform, sparse phase retrieval.
I Introduction
Sparse phase retrieval from the magnitude of the Fourier transform (SPRF) [1] has been widely studied in many fields including X-ray crystallography [2], optics [3, 4], blind channel estimation [5], and computational biology [6]. It recovers a $k$-sparse signal vector $\mathbf{x}$ (Footnote 1: A signal vector is called $k$-sparse if it has $k$ nonzero elements.) given a measurement vector $\mathbf{z}$ of the squared magnitude, $|\mathbf{F}\mathbf{x}|^{2}$, of a discrete Fourier transform of $\mathbf{x}$:
$$\mathbf{z} = |\mathbf{F}\mathbf{x}|^{2} + \mathbf{w}, \qquad (1)$$
where $\mathbf{F}$ is the discrete Fourier transform matrix, $|\cdot|$ denotes the elementwise absolute value, the support of $\mathbf{x}$ (i.e., the set of indices of the nonzero elements of $\mathbf{x}$) has size $k$, and $\mathbf{w}$ is a noise vector.
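As a minimal illustration of the measurement model in (1), the following Python sketch draws a sparse real signal and forms its squared Fourier magnitudes, optionally with noise. The function name `make_sprf_measurements`, the zero-padded `num_freqs`-point DFT, the standard Gaussian nonzero values, and the chi-squared noise with two degrees of freedom scaled to a target SNR are all assumptions made for illustration; they are not settings confirmed by the paper.

```python
import numpy as np


def make_sprf_measurements(n, k, num_freqs, snr_db=None, seed=0):
    """Draw a k-sparse real signal of length n and its squared Fourier magnitudes.

    Hypothetical helper: names and noise model are illustrative assumptions.
    Returns (x, support, z), with 0-based support indices.
    """
    rng = np.random.default_rng(seed)
    support = np.sort(rng.choice(n, size=k, replace=False))
    x = np.zeros(n)
    x[support] = rng.standard_normal(k)

    # Squared magnitude of the num_freqs-point DFT (zero-padded if num_freqs > n).
    clean = np.abs(np.fft.fft(x, n=num_freqs)) ** 2
    if snr_db is None:
        return x, support, clean

    # Assumed noise model: nonnegative chi-squared noise, rescaled so that
    # 10*log10(sum(clean) / sum(noise)) equals snr_db.
    noise = rng.chisquare(df=2, size=num_freqs)
    noise *= clean.sum() / (noise.sum() * 10.0 ** (snr_db / 10.0))
    return x, support, clean + noise
```

Calling the function twice with the same seed, once with `snr_db=None` and once with a finite value, returns the same signal, so the two measurement vectors differ only by the added noise.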
A commonly used algorithm to solve SPRF is greedy sparse phase retrieval (GESPAR), proposed by Shechtman et al. [7]. GESPAR performs a local search for and iteratively updates support estimate by exchanging one element in with one in , where is an estimated index set such that . Owing to this search technique, GESPAR reconstructs better than related algorithms (e.g., sparse Fienup [8], SDP [1], and two-stage sparse phase retrieval [9]). However, because GESPAR updates only one index in the support estimate per iteration, its efficiency (i.e., performance relative to complexity) can be severely degraded as the set difference between and widens, and its complexity scales with [10]. The complexity of GESPAR further increases as the signal dimension or the signal-to-noise ratio (SNR) increases, given that set approaches the full index set, , in either case.
A trained deep neural network (DNN) can obtain desired solutions with high efficiency by simply performing a matrix multiplication at each layer, without solving a specific optimization problem. DNNs have therefore notably improved image reconstruction and denoising in SPRF [11, 12, 13]. However, this advantage has been limited to image processing, because these DNNs exploit image features during learning. Consequently, existing research has not used DNNs to improve the recovery of general (synthetic) signals in SPRF. Showing that a DNN performs well on arbitrary synthetic signals in SPRF would suggest that it is also useful in SPRF applications other than image processing. Moreover, DNNs have achieved much lower complexity than other algorithms, with similar performance, when recovering synthetic signals in non-SPRF domains [14, 15, 16]. Hence, DNNs can be expected to improve the efficiency of recovering synthetic signals in the SPRF domain; we discuss this in detail in Section VI-A. To verify this, we propose an algorithm called phase retrieval with extended support estimation using DNN (PRED). PRED performs a one-shot retrieval of the support by exploiting prior information through a DNN applied to SPRF, and it improves on the efficiency of GESPAR in recovering synthetic sparse signals. Section VI-B discusses the scalability of PRED through intuition and principles.
Long short-term memory (LSTM) has the same structure as the iterations of Bayesian learning [15], and the phase retrieval problem can be solved either by using the Bayesian learning framework directly [17] or by implementing the framework as a subroutine [10]. In fact, SPRF can be solved by executing a linear inversion for sparse estimation (e.g., sparse Bayesian learning) as a subroutine [10]. Thus, LSTM-based DNNs implicitly make it possible to impose structural priors when estimating the support in SPRF. Therefore, we adopt a gated-feedback LSTM [15, 18] for the DNN in PRED, although other DNN architectures may also be applied to SPRF.
PRED determines extended support estimate to identify (Section III). The extended support denotes an index set with size larger than sparsity and containing . Specifically, we propose a DNN framework and its training rule to generate . For the DNN to be learnable, we define a union of equivalent solutions of and train the DNN to estimate this union set instead of (Section III-A). PRED iteratively obtains from the trained DNN output and estimates as a subset of through an algorithm called three-stage signal estimation (TSE); this process makes PRED scalable, as explained in Section VI-B. TSE extends the damped Gauss–Newton (DGN) algorithm from GESPAR [7] by taking more than indices as input (Section III-B). In addition, PRED improves on the efficiency of GESPAR to find . In fact, it simultaneously updates multiple indices in support estimate by exploiting a probability measure for , which is provided by the trained DNN output, whereas GESPAR updates one index in per iteration without utilizing such a measure. Numerical results confirm that PRED outperforms GESPAR and a state-of-the-art variant of the Fienup technique, called FISTA for phase retrieval (FISTAPH) [19], with lower complexity.
II Background
II-A DGN Algorithm
Suppose that an estimate of is given as (, where is the cardinality of ). If is correct (i.e., ), SPRF can be formulated as the minimization in (2), whose solution (i.e., -sparse vector ) is an estimate of .
[Equation (2)]
where , for , and is the discrete Fourier transform such that can be expressed as . For brevity, set is denoted as . A submatrix of with columns indexed by is denoted by . and denote the submatrix of with rows indexed by and the subvector of with entries indexed by , respectively.
Using first-order linear approximation, in (2) can be approximated as the th element of vector , where is the matrix whose th row is and is the vector whose th element is . Then, using , the DGN method applied in SPRF (Algorithm 1) estimates as a limit point of sequence obtained in steps 2 and 3, where is a step size determined by a backtracking procedure and denotes the minimum nonnegative integer such that . The limit of sequence has been proven to be stationary [7].
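The DGN iteration described above can be sketched in Python as follows. This is a hedged illustration, not the paper's Algorithm 1: it assumes a real-valued signal, assumes the objective is the sum of squared differences between the squared Fourier magnitudes and the measurements, and replaces the paper's backtracking rule with a plain "halve the step until the objective decreases" test. The names `objective` and `damped_gauss_newton` are hypothetical.

```python
import numpy as np


def objective(F_S, x_S, z):
    """Assumed cost: sum_i (|F_S x_S|_i^2 - z_i)^2."""
    return np.sum((np.abs(F_S @ x_S) ** 2 - z) ** 2)


def damped_gauss_newton(F, z, support, x0=None, max_iter=100, tol=1e-12, seed=0):
    """Gauss-Newton with step halving, restricted to the given support (0-based)."""
    support = np.asarray(support)
    F_S = F[:, support]
    rng = np.random.default_rng(seed)
    x_S = rng.standard_normal(len(support)) if x0 is None else np.array(x0, dtype=float)

    for _ in range(max_iter):
        Fx = F_S @ x_S
        r = np.abs(Fx) ** 2 - z                      # residual of the quadratic model
        f_old = np.sum(r ** 2)
        # Jacobian of |F_S x|^2 with respect to real x: 2 * Re(conj(Fx) * F_S).
        J = 2.0 * np.real(np.conj(Fx)[:, None] * F_S)
        d = -np.linalg.lstsq(J, r, rcond=None)[0]    # linearized least-squares step

        # Backtracking: halve the step size until the objective decreases.
        t, accepted = 1.0, False
        while t > 1e-12:
            f_new = objective(F_S, x_S + t * d, z)
            if f_new < f_old:
                accepted = True
                break
            t *= 0.5
        if not accepted:
            break
        x_S = x_S + t * d
        if f_old - f_new < tol:
            break

    x = np.zeros(F.shape[1])
    x[support] = x_S
    return x
```

Because a step is accepted only when it lowers the cost, the returned estimate never has a larger objective value than the starting point; like the original method, the sketch only reaches a stationary point and may miss the global minimum.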
II-B GESPAR
GESPAR (Algorithm 2) first determines two index sets, and , satisfying through an autocorrelation-based process. Then, it generates estimate of () such that and utilizes the DGN method with (Algorithm 1) to estimate signal and determine whether signal error is sufficiently small. GESPAR iteratively updates support estimate in step 3 using 2-opt local search, and the iterative process (steps 2–11) proceeds until either the signal error is sufficiently small or GESPAR exceeds iteration limit . More details on GESPAR can be found in [7].
III PRED Structure
To enhance the tradeoff between performance and complexity from GESPAR, the proposed PRED aims to determine extended support estimate from a trained DNN output and recover via the proposed TSE using . Suppose that is given and includes . Then, the SPRF problem can be formulated by (3), which estimates as solution .
[Equation (3)]
where for . Note that the original SPRF problem is equal to (3) with replaced by . Thus, by considering , the SPRF problem can be simplified such that the dimension of the target signal is reduced from to . Besides, it is easier to find such that than to identify . Hence, PRED adopts this principle to generate from a trained DNN output (Section III-A) and estimates by solving (3) given (Section III-B).
III-A DNN for Extended Support Estimation
We propose a DNN structure and a training method to obtain extended support of . Given the trivial ambiguity in SPRF (modulo Fourier invariances [9]), there exists a -sparse vector satisfying , whose support is defined by or for any integer such that . Therefore, even in the noiseless case, support of cannot be uniquely identified as given the measurement vector , and consequently it is hard to optimize the DNN for estimating . To solve this problem, we introduce union of equivalent solutions of the true support (UES), defined as , where , . Unlike true support , UES is uniquely determined by . Thus, the DNN is learnable without the ambiguity by considering its output as instead of . As the index is always included in , the DNN is trained to retrieve .
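The ambiguity and the union of equivalent supports can be illustrated with a short Python sketch. The maps `alpha` (shift the support so that its smallest index is 1) and `beta` (reflect the support and shift it the same way) are assumed forms of the two equivalent-support maps, chosen to be consistent with the statement that index 1 always belongs to the UES; the paper's exact definitions are not recoverable from this text. Indices are 1-based in the set functions and 0-based in the NumPy arrays.

```python
import numpy as np


def alpha(support):
    """Assumed map: shift a 1-based support so its smallest index becomes 1."""
    lo = min(support)
    return {i - lo + 1 for i in support}


def beta(support):
    """Assumed map: reflect a 1-based support, then shift its smallest index to 1."""
    hi = max(support)
    return {hi - i + 1 for i in support}


def union_of_equivalent_supports(support):
    """UES sketch: union of the shifted and the reflected support."""
    return alpha(support) | beta(support)


# A real signal with 1-based support {3, 4, 7} (0-based indices 2, 3, 6).
x = np.zeros(8)
x[[2, 3, 6]] = [1.0, -2.0, 0.5]

shifted = np.roll(x, -2)   # support moves to 1-based {1, 2, 5}; no wrap-around occurs
flipped = x[::-1]          # reflected signal

# All three signals share the same zero-padded Fourier magnitudes.
mag_x = np.abs(np.fft.fft(x, n=16))
mag_shifted = np.abs(np.fft.fft(shifted, n=16))
mag_flipped = np.abs(np.fft.fft(flipped, n=16))

ues = union_of_equivalent_supports({3, 4, 7})
```

Here the magnitudes of the three signals coincide, so the measurements cannot distinguish their supports, whereas the set `ues` is the same for all of them.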
III-A1 DNN Structure and Training Objective
For sparse vector , whose support is denoted by and measurement vector given discrete Fourier transform matrix , the DNN defined by takes vector as its input and is trained to return vector such that for and for , where represents the -dimensional probability simplex. Each element of vector is a probability, with index belonging to . Thus, for any integer such that , an extended support of including can be obtained by selecting the largest elements from the ideally trained DNN output. For instance, if is 6 and support is so that , is trained to return output vector as , and is obtained by selecting its largest elements.
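Selecting the extended support from the DNN output amounts to a top-$b$ selection, where $b$ denotes the size of the extended support estimate. A minimal Python sketch follows; the function name `extended_support_from_scores` and the example scores are hypothetical, and indices are 0-based.

```python
import numpy as np


def extended_support_from_scores(scores, b):
    """Return the sorted 0-based indices of the b largest entries of `scores`."""
    scores = np.asarray(scores, dtype=float)
    top = np.argpartition(scores, -b)[-b:]   # indices of the b largest, in arbitrary order
    return np.sort(top)


# Hypothetical DNN output on the probability simplex for a length-6 signal.
scores = np.array([0.05, 0.40, 0.00, 0.30, 0.25, 0.00])
extended = extended_support_from_scores(scores, b=3)
```

With ties among the scores, which of the tied indices is selected is arbitrary; a deterministic tie-breaking rule would have to be added if that matters.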
III-A2 DNN Training
We randomly sampled a -sparse signal vector , whose sparsity ranges from to . We set to in this study (Section V). From signal vector , a measurement vector satisfying (1) can be obtained by adding random noise vector according to the given SNR. For the distribution of pairs produced this way, the training goal can be formulated as the minimization of (4).
[Equation (4)]
where is the cross-entropy between vectors and of dimension , is an -dimensional vector , whose nonzero elements are and its support is given by set , is the support of , and is the training parameter to be updated for minimizing . The detailed description is shown in Appendix.
III-B Three-Stage Signal Estimation
From the trained DNN output, we can obtain extended support estimate such that . Then, it remains to estimate the true support as or by selecting indices from . This is resolved by the minimization in (3) given . We introduce the TSE in Algorithm 3, which calls the DGN method (Algorithm 1) twice and approximately solves the minimization in (3) through the following three stages: (a) temporary signal estimation from the given : ; (b) support estimation from : ; and (c) signal estimation from : , where and are the signal and support estimates, respectively.
The first stage of TSE (step 1) minimizes the cost in (a), with a temporary signal estimate supported on obtained by applying the DGN method (Algorithm 1) with such that DGN(). The second stage (step 2) retrieves support estimate as a subset of by approximating the minimization in (b) through hard thresholding (i.e., selecting indices corresponding to the largest absolute values of supported on ). Finally, TSE determines through the DGN method with to solve (c) at the third stage (step 3).
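The three stages can be sketched in Python as follows. The sketch treats the support-restricted solver as a black box: `solve_on_support` is a hypothetical callable standing in for the DGN method, taking a 0-based index array and returning a full-length signal estimate that is zero outside those indices. It is not the paper's Algorithm 3 verbatim.

```python
import numpy as np


def three_stage_signal_estimation(solve_on_support, extended_support, k):
    """TSE sketch: solve on E, hard-threshold to k indices, then solve again."""
    E = np.asarray(extended_support)

    # Stage (a): temporary signal estimate supported on the extended support.
    x_tmp = solve_on_support(E)

    # Stage (b): hard thresholding, keeping the k largest magnitudes within E.
    order = np.argsort(-np.abs(x_tmp[E]))
    support_est = np.sort(E[order[:k]])

    # Stage (c): final signal estimate supported on the selected k indices.
    x_hat = solve_on_support(support_est)
    return x_hat, support_est
```

Passing the solver as an argument keeps the sketch independent of how the support-restricted minimization is carried out, so any routine with the same interface can be substituted.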
IV PRED Algorithm
The proposed PRED is detailed in Algorithm 4 and estimates as through trained DNN and by applying TSE (Algorithm 3). The trivial ambiguity of SPRF [9] implies that index can be considered an element in the true support. By selecting the largest values of the trained DNN output, PRED initializes extended support estimate , where is the size of randomly sampled from to . Note that is larger than from inequality , and the th element of the trained DNN output indicates a probability of index belonging to UES . Thus, is the set expected to include one of the equivalent solutions of in (i.e., either or ). Under premise or , PRED solves the minimization in (3) by executing TSE to estimate as () in step 2. Then, PRED terminates depending on whether current signal error obtained from the estimate is below input threshold in steps 3 and 4. If the signal error is higher than , PRED executes steps 5–7 to obtain a new extended support estimate via an update process, which is executed in step 7 by replacing the complementary set of in with multiple indices randomly selected according to discrete probability vector generated from DNN output . Vector represents the probability of each index in belonging to . By using the updated extended support , TSE estimates the signal vector and its support as () in step 2 at the next iteration. This process is iterated until either the signal error is sufficiently small or the number of iterations exceeds limit .
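The update of the extended support estimate can be sketched in Python as below. The sketch keeps the current support estimate and refills the remaining slots by sampling without replacement from the DNN output renormalized over the other indices. How the paper forms its discrete probability vector is not recoverable from this text, so that renormalization, the small constant added to keep sampling feasible, and the name `update_extended_support` are assumptions.

```python
import numpy as np


def update_extended_support(support_est, dnn_output, b, rng):
    """Keep `support_est` and sample the other b - |support_est| indices (0-based)."""
    keep = np.asarray(sorted(support_est))

    # Assumed probability vector: DNN output restricted to indices outside `keep`.
    # The tiny offset guarantees enough nonzero entries for sampling without replacement.
    p = np.asarray(dnn_output, dtype=float) + 1e-12
    p[keep] = 0.0
    p /= p.sum()

    new = rng.choice(len(p), size=b - len(keep), replace=False, p=p)
    return np.sort(np.concatenate([keep, new]))


rng = np.random.default_rng(0)
dnn_output = np.array([0.1, 0.3, 0.2, 0.0, 0.3, 0.1, 0.0, 0.0])
extended = update_extended_support([1, 4], dnn_output, b=4, rng=rng)
```

Several indices change in one update, in contrast to a local search that exchanges a single index per iteration.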
V Numerical Experiments and Results
We compared the performance of PRED against similar algorithms, namely, GESPAR, FISTAPH, and phase-retrieval generalized approximate message passing (PRGAMP) [10]. For a fair comparison, we applied the same stopping criterion shown in steps 3 and 4 of Algorithm 4 to GESPAR and FISTAPH. We executed FISTAPH times for and to get multiple candidate solutions and selected the one with minimum error among them. The soft thresholding parameter of FISTAPH was set to , and we selected the largest elements of its signal estimate to recover support . We used uniform and Gaussian models to generate signals. In the uniform model, each nonzero element of was sampled from to excluding the interval from to . In the Gaussian model, each nonzero element of was sampled from a standard Gaussian distribution. We assumed that each entry of follows a chi-squared distribution with degrees of freedom . For a complex random variable , whose real and imaginary parts are i.i.d. following a Gaussian distribution, . Hence, the th element of noise vector can be set to , such that for , where is determined by the SNR. Given support estimate , we use modulo Fourier invariances and define the recovery success rate $\mathbb{E}[\max(\mathbf{1}(\alpha(\mathcal{T})=\alpha(\mathcal{S})),\,\mathbf{1}(\alpha(\mathcal{T})=\beta(\mathcal{S})))]$ and the soft recovery success rate $\mathbb{E}[\max(|\alpha(\mathcal{T})\cap\alpha(\mathcal{S})|,\,|\alpha(\mathcal{T})\cap\beta(\mathcal{S})|)/k]$ for the support, where $\mathbf{1}(\cdot)$ is the indicator function. We set input parameters in PRED and GESPAR to and iteration limit in PRED and GESPAR to and , respectively.
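For a single trial, the two success measures above can be computed as in the following Python sketch; averaging over trials gives the reported rates. The maps `alpha` and `beta` are assumed to be the shift-normalized and reflected supports (1-based), which is an assumption about definitions not recoverable from this text, and `support_recovery` is a hypothetical name.

```python
def alpha(support):
    """Assumed map: shift a 1-based support so its smallest index becomes 1."""
    lo = min(support)
    return {i - lo + 1 for i in support}


def beta(support):
    """Assumed map: reflect a 1-based support, then shift its smallest index to 1."""
    hi = max(support)
    return {hi - i + 1 for i in support}


def support_recovery(first, second):
    """Return (exact, soft) success for one trial, modulo Fourier invariances.

    `first` and `second` are the two supports compared in the success-rate
    formulas (passed as alpha(first) against alpha/beta(second)); k is taken
    as the size of `second`.
    """
    a_first = alpha(first)
    a_second, b_second = alpha(second), beta(second)
    exact = 1.0 if (a_first == a_second or a_first == b_second) else 0.0
    soft = max(len(a_first & a_second), len(a_first & b_second)) / len(second)
    return exact, soft
```

For example, the supports {2, 5, 6} and {3, 4, 7} are reflections of each other up to a shift, so the pair counts as an exact recovery under these assumed maps.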
For the gated-feedback LSTM used in PRED [18], we set the hidden unit size, number of unfolding steps, and layer size to , , and , respectively. Further details on this network are available in [15]. The detailed description and settings for learning DNN are given in Appendix.
We evaluated each algorithm with different SNRs and dimension values of . Figs. 1(a)–(d) and Figs. 1(e)–(g) show the rate of successful support recovery and the execution time of each algorithm, respectively. In most of the sparsity region, PRED outperforms the other algorithms, has lower complexity, and is more robust to noise. We can expect PRED to scale well with , as Figs. 1(a)–(d) show that PRED consistently recovers about twice the sparsity of GESPAR and FISTAPH for different and SNR values. (Footnote 2: In Fig. 1(b), the maximum satisfying support recovery rates higher than % is , , and for PRED, GESPAR, and FISTAPH, respectively.) Figs. 1(e)–(f) show that the running time of PRED is less than half of that of the other methods at sparsity below .
Given that PRED and GESPAR both iterate DGN (Algorithm 1), their complexity is expressed as , where is the average complexity of DGN and is the average number of DGN executions in each algorithm. Note that the complexity of the pseudoinverse in step 2 of DGN with input of size is generally , where is set to and in GESPAR and PRED, respectively, for a constant smaller than . (Footnote 3: Given that size of the index set for the DGN input in steps 1 and 3 in TSE (Algorithm 3) is smaller than and equal to , respectively, and its mean are smaller than and , respectively.) This implies that the average complexity of the DGN used in PRED and GESPAR has the same order for , and hence their complexity depends mainly on . Figs. 1(h)–(i) show that in PRED is at most about one-third of in GESPAR. Thus, the complexity of PRED and GESPAR has the same order for , which supports the results in Figs. 1(e)–(g) showing that the execution time of PRED is shorter than that of GESPAR in most of the sparsity region.
Figs. 1(j)–(l) show the performance results for the zero-mean and unit-variance Gaussian model. PRGAMP was compared only on the Gaussian model due to its structural characteristics. (Footnote 4: The public software package implemented in MATLAB was used to test PRGAMP. The other methods were implemented in Python with TensorFlow.) Even for the Gaussian model, PRED has a superior performance with lower complexity than existing algorithms including PRGAMP. (Footnote 5: We excluded the performance result of PRGAMP in Fig. 1(j) because it is zero for the whole sparsity region.)
VI Discussion
VI-A The scalability of DNN to recover synthetic signals for SPRF
Subsections VI-A1 and VI-A2 support the claim that the DNN structure is scalable for estimating the support in SPRF. Subsection VI-A3 argues that the DNN can outperform other methods for solving SPRF.
VI-A1 The DNN imposes structural priors for support estimation in SPRF
We introduce the following three results (a)–(c) by referring to [10], [15], and [17]:
- (a) In [15], it is shown that a canonical LSTM cell has the same structure as the computational flow of each iteration of the Bayesian learning framework (BLF).
- (b) In [17], it is shown that phase retrieval can be solved by using the BLF.
- (c) In [10], the PRGAMP algorithm has an inner loop in which the GAMP algorithm, a compressed sensing algorithm for sparse linear inversion, is used to estimate the sparse signal. Thus, in PRGAMP, the GAMP algorithm can be replaced by any compressed sensing algorithm to estimate the support in SPRF. As the sparse BLF is a compressed sensing algorithm, we can conclude that SPRF can be solved by using the BLF; this also supports result (b).
Results (a)–(c) imply that LSTM is a generalized (i.e., learned) version of the BLF (from result (a)) and that support estimation in SPRF can be done by the BLF (from results (b) and (c)). Furthermore, the BLF is known to be a scalable algorithm, as it uses structural priors to estimate the target support. Therefore, the LSTM-based DNN implicitly makes it possible to impose structural priors when estimating the target support in SPRF.
VI-A2 The DNN is scalable according to signal dimension n for support estimation in SPRF
Note that the BLF is scalable for signal dimension and LSTM has the same structure as the BLF. Thus, LSTM can be scalable for by imposing the structural prior to estimate the target support in SPRF. Our test results in Figs. 1(a)–(c) imply that the LSTM-based DNN used in the proposed PRED estimates the true support with high probability, irrespective of . This is because PRED consistently recovers about twice the sparsity with lower complexity than other related methods, for different values of . This supports our claim that the DNN (i.e., LSTM) is scalable for to estimate the target support in SPRF.
Note that in our test, the signal is sampled from two continuous probability distributions (i.e., uniform and Gaussian). This implies that there is an infinite number of pairs of a measurement vector and its corresponding true support, given any fixed sparsity . Thus, if the DNN had no inherent structure, it could not recover all true supports, because it would have to store an infinite number of cases.
The test results in Figs. 1(a)–(c) show that the DNN in PRED can recover all supports with probability one at about twice the sparsity of related SPRF methods. Hence, the DNN (i.e., LSTM) does not simply store all the supports. Instead, it has an implicit structure to estimate the target support via its forward computational flow, which resembles the computational flow of the BLF. This inherent structure also ensures that the DNN is scalable.
VI-A3 The DNN can outperform existing methods for support estimation in SPRF
The LSTM can be interpreted as a learned version of the BLF. It has been shown in [20] and [21] that learned versions of approximate message passing and the iterative shrinkage thresholding algorithm outperform their counterparts for estimating the target support. This indicates that LSTM (i.e., the learned version of the BLF) can outperform the BLF for estimating the support in SPRF, as demonstrated in [15], although for a problem different from SPRF. Together with results (b) and (c) in Section VI-A1, this implies that the DNN (i.e., LSTM) can outperform existing support estimators for SPRF. We demonstrated in Section V that the proposed PRED, which uses the LSTM as its DNN architecture, outperforms other SPRF methods, supporting our claim.
VI-B Demonstration of PRED scalability through intuition and principles
PRED has two main steps: (1) extended support estimation from the DNN output and (2) support estimation from the extended support estimate via the TSE algorithm (Algorithm 3). In step (1), the DNN provides a set (the extended support estimate) that includes the true support with high probability, as the DNN (i.e., LSTM) has an implicit structure for estimating the support, as discussed in Section VI-A. Step (1) has low complexity because extended support estimation with the DNN requires only a matrix multiplication at each DNN layer, without solving a specific optimization problem. Thus, the complexity (performance) of PRED depends mainly on step (2). As shown in the last paragraph of Section V, the complexity of step (2) is O(), where $b$ is the size of the extended support estimate obtained from step (1). Thus, the maximum of is where is the sparsity. Hence, the complexity of PRED is of the order of (i.e., O()) and does not depend on signal dimension . Without step (1), the complexity of PRED would be O(), because the extended support estimate in step (2) would be the whole index set. Therefore, PRED is scalable (its complexity is affected not by but by ): the TSE algorithm searches for the support not in the whole index set but in the extended support estimate obtained from the DNN. This justifies the combination of the DNN with existing SPRF algorithms, as discussed in Section VII.
VII Conclusion
Although a DNN cannot estimate the support exactly, it can efficiently estimate a set containing it [15]. On the other hand, the optimization-based approach is less efficient at finding the support from the full set of indices, but is highly accurate when searching a relatively small set that includes the support. We leverage the advantages of both approaches to perform DNN-based extended support estimation and show, for the first time, that this approach, called PRED, outperforms existing algorithms in recovering common sparse signals for SPRF.
Appendix DNN Training
Appendix-A Description of the proposed algorithm for training the DNN
Algorithm 5 describes the proposed training method for the DNN . It considers noisy training data for the DNN to estimate the UES, and the case when sparsity of in the test data is unknown, with minimum and maximum bounds given by and , respectively, to generate the training data. In the algorithm, is the number of epochs, is the size of batch data, is the number of batches per epoch, and $v_{\mathrm{SNR_{dB}}}$ is the SNR in decibels (dB), which is expected to be . For each epoch, steps 3–9 generate training data , where and are the signal and measurement vectors, respectively. Specifically, in step 5, signal vector , whose sparsity is uniformly sampled between and , is sampled from the conditional probability of the signal vector given sparsity , defined as
[Equation: conditional probability of the signal vector given its sparsity]
where is the distribution of and is the indicator function. Measurement vector in step 9 is given by plus noise vector such that $10\log_{10}\left(\sum_{j} z_{i}[j] \,/\, \sum_{j} w_{i}[j]\right) \geq v_{\mathrm{SNR_{dB}}}$. The training goal is to minimize cost in step 13, with being an -dimensional vector , whose nonzero elements are and support is given by set , and is the cross-entropy between vectors and of dimension . Training parameter is updated to minimize in steps 14 and 15 through with its learning rate .
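The training target and the cross-entropy cost can be illustrated with a small Python sketch. It assumes that the target vector places equal mass on each index of the labeled set, so that it lies on the probability simplex; the exact nonzero value used in the paper is not recoverable from this text. The names `uniform_target` and `cross_entropy` are hypothetical, and the label set uses 1-based indices.

```python
import numpy as np


def uniform_target(label_set, n):
    """Assumed target: equal probability on each (1-based) index in `label_set`."""
    target = np.zeros(n)
    idx = np.array(sorted(label_set)) - 1   # convert to 0-based positions
    target[idx] = 1.0 / len(idx)
    return target


def cross_entropy(target, prediction, eps=1e-12):
    """Cross-entropy -sum(target * log(prediction)); eps guards against log(0)."""
    prediction = np.asarray(prediction, dtype=float)
    return -np.sum(target * np.log(prediction + eps))


target = uniform_target({2, 4, 5}, n=6)
loss_perfect = cross_entropy(target, target)               # prediction equals the target
loss_uniform = cross_entropy(target, np.full(6, 1.0 / 6))  # uninformative prediction
```

Under this assumed target, a prediction equal to the target attains the smallest possible cost (the entropy of the target), while an uninformative uniform prediction gives a larger value.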
Appendix-B Experimental settings for Section V
To train the network given by , we used Algorithm 5 by setting input to and fixing $v_{\mathrm{SNR_{dB}}}$ to the SNR. In addition, we used RMSprop optimization with learning rate of for epochs and for epochs from to () to update the training parameter.
References
[1] K. Jaganathan, S. Oymak, and B. Hassibi, “Recovery of sparse 1-D signals from the magnitudes of their Fourier transform,” in Proceedings of IEEE International Symposium on Information Theory, 2012, pp. 1473–1477.
[2] R. P. Millane, “Phase retrieval in crystallography and optics,” Journal of the Optical Society of America A, vol. 7, no. 3, pp. 394–411, 1990.
[3] Y. Shechtman, Y. C. Eldar, O. Cohen, H. N. Chapman, J. Miao, and M. Segev, “Phase retrieval with application to optical imaging: A contemporary overview,” IEEE Signal Processing Magazine, vol. 32, no. 3, pp. 87–109, 2015.
[4] V. Y. Katkovnik and K. Egiazarian, “Sparse superresolution phase retrieval from phase-coded noisy intensity patterns,” Optical Engineering, vol. 56, no. 9, p. 094103, 2017.
[5] B. Baykal, “Blind channel estimation via combining autocorrelation and blind phase estimation,” IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 51, no. 6, pp. 1125–1131, 2004.
[6] M. Stefik, “Inferring DNA structures from segmentation data,” Artificial Intelligence, vol. 11, no. 1-2, pp. 85–114, 1978.
[7] Y. Shechtman, A. Beck, and Y. C. Eldar, “GESPAR: Efficient phase retrieval of sparse signals,” IEEE Transactions on Signal Processing, vol. 62, no. 4, pp. 928–938, 2014.
[8] S. Mukherjee and C. S. Seelamantula, “An iterative algorithm for phase retrieval with sparsity constraints: Application to frequency domain optical coherence tomography,” in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, 2012, pp. 553–556.
