Denoising-based Turbo Compressed Sensing
Zhipeng Xue, Junjie Ma, and Xiaojun Yuan

TL;DR
This paper introduces Denoising-based Turbo Compressed Sensing (D-Turbo-CS), a flexible algorithm that improves sparse signal recovery by employing generic denoisers, extending Turbo-CS to more complex signal structures without prior distribution knowledge.
Contribution
The paper proposes a novel D-Turbo-CS algorithm that uses generic denoisers, enabling efficient recovery of complex signals like images and low-rank matrices without prior distribution information.
Findings
D-Turbo-CS outperforms existing algorithms in reconstruction quality.
D-Turbo-CS has faster running times compared to traditional methods.
The algorithm's dynamics are accurately described by a simple MSE evolution recursion.
Abstract
Turbo compressed sensing (Turbo-CS) is an efficient iterative algorithm for sparse signal recovery with partial orthogonal sensing matrices. In this paper, we extend the Turbo-CS algorithm to solve compressed sensing problems involving more general signal structure, including compressive image recovery and low-rank matrix recovery. A main difficulty for such an extension is that the original Turbo-CS algorithm requires prior knowledge of the signal distribution that is usually unavailable in practice. To overcome this difficulty, we propose to redesign the Turbo-CS algorithm by employing a generic denoiser that does not depend on the prior distribution and hence the name denoising-based Turbo-CS (D-Turbo-CS). We then derive the extrinsic information for a generic denoiser by following the Turbo-CS principle. Based on that, we optimize the parametric extrinsic denoisers to minimize the…
| Image Name | Lena | Boat | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Measurement rate | 5% | 10% | 20% | 30% | 50% | 70% | 5% | 10% | 20% | 30% | 50% | 70% |
| EM-GM-AMP [16] | 21.56 | 23.33 | 25.22 | 26.89 | 29.50 | 32.38 | 19.55 | 21.06 | 22.87 | 24.70 | 27.78 | 30.95 |
| LET-AMP [26] | - | - | - | 22.09 | 31.38 | 34.57 | - | - | 0.23 | 20.02 | 29.62 | 33.28 |
| LET-Turbo-CS | 22.27 | 24.32 | 26.77 | 28.57 | 31.74 | 35.48 | 19.96 | 21.86 | 24.43 | 26.51 | 30.16 | 34.22 |
| BM3D-AMP [15] | 28.44 | 31.77 | 33.90 | 34.36 | 38.55 | 39.48 | 26.15 | 28.83 | 31.67 | 33.54 | 35.46 | 39.21 |
| BM3D-Turbo-CS | 29.28 | 31.88 | 34.35 | 35.88 | 38.60 | 42.64 | 26.15 | 29.03 | 32.32 | 34.30 | 37.32 | 40.92 |
| SGK-AMP [33] | 7.67 | 8.27 | 27.92 | 29.85 | 33.17 | 35.87 | 5.35 | 5.53 | 25.49 | 28.07 | 31.42 | 34.57 |
| SGK-Turbo-CS | 7.70 | 8.35 | 29.01 | 31.30 | 34.60 | 37.85 | 5.39 | 5.56 | 26.22 | 28.85 | 32.54 | 35.90 |
| Image Name | Barbara | Fingerprint | ||||||||||
| Measurement rate | 5% | 10% | 20% | 30% | 50% | 70% | 5% | 10% | 20% | 30% | 50% | 70% |
| EM-GM-AMP [16] | 18.54 | 20.56 | 22.65 | 24.47 | 27.69 | 32.14 | 16.81 | 18.24 | 20.37 | 22.51 | 26.04 | 29.58 |
| LET-AMP [26] | - | - | - | 19.92 | 27.57 | 31.15 | - | - | - | 17.78 | 29.05 | 33.46 |
| LET-Turbo-CS | 18.87 | 20.65 | 22.86 | 24.58 | 28.07 | 32.36 | 16.03 | 18.03 | 22.09 | 24.05 | 29.83 | 34.90 |
| BM3D-AMP [15] | 26.74 | 29.52 | 32.81 | 35.21 | 38.46 | 41.66 | 18.18 | 22.75 | 26.61 | 28.59 | 32.07 | 36.44 |
| BM3D-Turbo-CS | 26.73 | 30.40 | 34.23 | 36.46 | 39.91 | 43.37 | 18.04 | 24.53 | 27.73 | 30.28 | 34.51 | 38.91 |
| SGK-AMP [33] | 5.94 | 6.35 | 25.58 | 28.20 | 32.13 | 35.78 | 4.59 | 4.76 | 20.32 | 23.98 | 28.49 | 32.39 |
| SGK-Turbo-CS | 5.88 | 6.36 | 26.43 | 29.30 | 33.65 | 37.77 | 4.58 | 4.82 | 20.46 | 24.10 | 28.38 | 32.79 |
| Image Name | Lena | Boat | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Measurement rate | 5% | 10% | 20% | 30% | 50% | 70% | 5% | 10% | 20% | 30% | 50% | 70% |
| LET-AMP [26] | 3.14 | 2.43 | 2.57 | 2.41 | 2.46 | 2.58 | 2.91 | 2.35 | 2.42 | 2.47 | 3.65 | 3.37 |
| LET-Turbo-CS | 1.40 | 1.23 | 1.23 | 1.19 | 0.93 | 0.85 | 0.97 | 1.16 | 1.21 | 1.39 | 1.63 | 1.27 |
| Image Name | Barbara | Fingerprint | ||||||||||
| Measurement rate | 5% | 10% | 20% | 30% | 50% | 70% | 5% | 10% | 20% | 30% | 50% | 70% |
| LET-AMP [26] | 3.48 | 3.55 | 3.11 | 3.86 | 3.58 | 3.24 | 4.41 | 3.61 | 3.41 | 3.30 | 3.10 | 4.08 |
| LET-Turbo-CS | 1.44 | 1.58 | 1.37 | 1.41 | 1.75 | 1.46 | 1.30 | 2.40 | 3.08 | 3.02 | 1.69 | 1.46 |
| Sensing matrices | ||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Image Name | Lena | Boat | ||||||||||
| Measurement rate | 30% | 50% | 70% | 30% | 50% | 70% | 30% | 50% | 70% | 30% | 50% | 70% |
| BM3D-AMP [15] | 7.68 | 7.59 | 13.83 | 34.36 | 38.55 | 39.48 | - | 9.23 | 19.11 | 33.54 | 35.46 | 39.21 |
| BM3D-Turbo-CS | 7.68 | 8.69 | 13.84 | 35.88 | 38.60 | 42.64 | - | 9.32 | 19.15 | 34.30 | 37.32 | 40.92 |
| Image Name | Barbara | Fingerprint | ||||||||||
| Measurement rate | 30% | 50% | 70% | 30% | 50% | 70% | 30% | 50% | 70% | 30% | 50% | 70% |
| BM3D-AMP [15] | 5.88 | 9.93 | 21.06 | 35.21 | 38.46 | 41.66 | - | 7.9 | 20.05 | 28.59 | 32.07 | 36.44 |
| BM3D-Turbo-CS | 5.88 | 10.18 | 21.07 | 36.46 | 39.91 | 43.37 | - | 8.39 | 20.07 | 30.28 | 34.51 | 38.91 |
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsSparse and Compressive Sensing Techniques · Image and Signal Denoising Methods · Photoacoustic and Ultrasonic Imaging
Denoising-based Turbo Compressed Sensing
Zhipeng Xue, Junjie Ma, and Xiaojun Yuan Z. Xue and X. Yuan are with the School of Information Science and Technology, ShanghaiTech University. J. Ma is with the Department of Statistics, Columbia University. The work in this paper was partially presented at the GlobalSIP in Dec. 2016; see reference [12].
Abstract
Turbo compressed sensing (Turbo-CS) is an efficient iterative algorithm for sparse signal recovery with partial orthogonal sensing matrices. In this paper, we extend the Turbo-CS algorithm to solve compressed sensing problems involving more general signal structure, including compressive image recovery and low-rank matrix recovery. A main difficulty for such an extension is that the original Turbo-CS algorithm requires prior knowledge of the signal distribution that is usually unavailable in practice. To overcome this difficulty, we propose to redesign the Turbo-CS algorithm by employing a generic denoiser that does not depend on the prior distribution and hence the name denoising-based Turbo-CS (D-Turbo-CS). We then derive the extrinsic information for a generic denoiser by following the Turbo-CS principle. Based on that, we optimize the parametric extrinsic denoisers to minimize the output mean-square error (MSE). Explicit expressions are derived for the extrinsic SURE-LET denoiser used in compressive image denoising and also for the singular value thresholding (SVT) denoiser used in low-rank matrix denoising. We find that the dynamics of D-Turbo-CS can be well described by a scaler recursion called MSE evolution, similar to the case for Turbo-CS. Numerical results demonstrate that D-Turbo-CS considerably outperforms the counterpart algorithms in both reconstruction quality and running time.
Index Terms:
Compressed sensing, message passing, orthogonal sensing matrix, denoising, MSE evolution.
I Introduction
Compressed sensing (CS) [1] is a new paradigm for sparse signal reconstruction. A common approach for compressed sensing problem is to solve a mixed -norm and -norm minimization problem via convex programming [2]. However, a convex program in general involves polynomial-time complexity, which causes a serious scalability problem for mass data applications.
Approximate algorithms have been extensively studied to reduce the computational complexity of sparse signal recovery. Existing approaches include match pursuit [3], orthogonal match pursuit [4], iterative soft thresholding [5], compressive sampling matching pursuit [6], and approximate message passing (AMP) [7]. In particular, AMP is a fast-convergence iterative algorithm based on the principle of message passing. It has been shown that, when the sensing matrix is independent and identically distributed (i.i.d.) Gaussian, AMP is asymptotically optimal as the dimension of the state space goes to infinity [8]. Also, the iterative process of AMP can be tracked through a scalar recursion called state evolution.
In many applications, compressive measurements are taken from a transformed domain, such as discrete Fourier transform (DFT), discrete cosine transform (DCT), and wavelet transform, etc. This, on one hand, can exempt us from storing the sensing matrix in implementation; on the other hand, these orthogonal transforms can be realized using fast algorithms to reduce the computational complexity. However, the AMP algorithm, when applied to orthogonal sensing, does not perform well and its simulated performance deviates away from the prediction by the state evolution.
Turbo compressed sensing (Turbo-CS) [9] solved the above discrepancy by a careful redesign of the message passing algorithm. The Turbo-CS algorithm consists of two processing modules: One module handles the linear measurements of the sparse signal based on the linear minimum mean-square error (LMMSE) principle and calculates the so-called extrinsic information to decorrelate the input and output estimation errors; the other module combines its input with the signal sparsity by following the minimum mean-square error (MMSE) principle and also calculates the extrinsic information. The two modules are executed iteratively to refine the estimates. This is similar to the decoding process of a turbo code [10], hence the name Turbo-CS. It has been shown that Turbo-CS considerably outperforms its counterparts for compressed sensing in both complexity and convergence speed.
In this paper, we extend the Turbo-CS algorithm to solve compressed sensing problems with partial orthogonal sensing matrices involving more general signal structures, such as compressive image recovery and low-rank matrix recovery. An immediate obstacle for such an extension is that the MMSE module in the Turbo-CS algorithm in [9] requires the prior knowledge of the signal distribution, while the latter is generally unavailable in the new problems under concern. To overcome this obstacle, we replace the MMSE module in Turbo-CS by a generic denoiser that does not depend on the prior distribution. We derive the extrinsic information for a generic denoiser by following the Turbo-CS principle. Interestingly, we show that the resulting extrinsic denoiser falls into the category of divergence-free denoisers in [11]. Based on that, we propose to optimize the parametric extrinsic denoisers to minimize the output mean-square error (MSE). Explicit expressions are derived for the extrinsic SURE-LET denoiser used in image denoising [13] and also for the singular value thresholding (SVT) denoiser used in low-rank matrix denoising [14].
We find the dynamics of denoising-based Turbo-CS (D-Turbo-CS) can be characterized by a scaler recursion called MSE evolution. We also study the impact of the choice of the sensing matrix on the accuracy of the MSE evolution in Turbo-CS. We show that when the signals to be recovered are i.i.d., the output error of the LMMSE module can be modelled as an additive Gaussian noise and the corresponding state evolution is accurate. However, the state evolution is not necessarily accurate when correlated signals are involved, e.g., in the case of image denoising where the neighbouring pixels of an image are usually continuous in value and so are correlated to each other. We show that this problem can be solved by an appropriate design of the sensing matrix. A simple solution is to right-multiply the sensing matrix by an extra diagonal matrix with random +1 or -1 in the diagonal. This extra diagonal matrix randomly flips the signs of the signals, and effectively decorrelates the signals.
We further compare the performance of D-Turbo-CS with the state-of-the-art algorithms in the literature. For example, denoising-based AMP (D-AMP) was studied in [15], and a number of popular image denoisers were examined therein. Also, the EM-GM-AMP algorithm proposed in [16] can be applied to the compressed image denoising problem under concern. Numerical results demonstrate that D-Turbo-CS considerably outperforms D-AMP and EM-GM-AMP in both convergence rate and recovery accuracy.
The remainder of the paper proceeds as follows. Section II takes a brief review of the Turbo-CS algorithm in [9]. Section III describes how to extend the Turbo-CS algorithm for a generic denoiser. The construction of an extrinsic denoiser is discussed in Section IV. Section V studies the MSE evolution for D-Turbo-CS. Numerical comparisons of Turbo-CS with its counterparts are presented in Section VI. Section VII concludes the paper.
II Preliminaries
II-A Compressed Sensing
Consider the following real-valued linear system:
[TABLE]
where is an unknown signal vector, is a known constant matrix, and is a white Gaussian noise vector with zero mean and covariance . Here, represents the identity matrix of an appropriate size. Our goal is to recover from the measurement . In particular, this problem is known as compressed sensing when and is sparse.
Basis Pursuit De-Noising (BPDN) is a well-known approach to the recovery of in compressed sensing, with the problem formulated as
[TABLE]
where represents the -norm, is the th entry of , and is a regularization parameter. This problem can be solved by convex programming algorithms, such as the interior point method [17] and the proximal method [18]. Interior point method has cubic computational complexity, which is too expensive for high-dimensional applications such as imaging. Proximal methods have low per-iteration complexity. However, its convergence speed is typically slow.
Message passing is a promising alternative to solve the BPDN problem in (2). To apply message passing, we first notice that (2) can be viewed as a maximum a posteriori probability (MAP) estimation problem. Specifically, we assign a prior distribution to . Then, it is easy to verify that in (2) is equivalent to:
[TABLE]
where . In [19], a factor graph was established to represent above the probability model, based on which approximate message passing (AMP) was used to iteratively solve the inference problem in (3). As the established factor graph is dense in general, directly applying message passing to the graph leads to high complexity. To reduce complexity, two approximations are introduced in AMP: First, messages from factor nodes to variable nodes are nearly Gaussian; second, messages from variable nodes to factor nodes can be calculated by using Taylor-series approximation to reduce computational cost. It was shown in [19] that the approximation error vanishes when with a fixed ratio.
The convergence of the AMP algorithm requires that the elements of the sensing matrix are sufficiently random. It was shown in [8] that AMP is asymptotically optimal when is i.i.d. Gaussian and the behavior of AMP can be characterized by a scaler recursion called state evolution.
II-B Turbo Compressed Sensing
In many applications, the sensing matrix is neither i.i.d. nor Gaussian. For example, to reduce storage and computational complexity, measurements are usually taken from an orthogonal transform domain, such as DFT or DCT. In these scenarios, The performance of AMP deteriorates and the convergence of AMP is not guaranteed. This motivates the development of the Turbo-CS algorithm [9] described below.
The block diagram of the Turbo-CS algorithm is illustrated in Fig. 1. Turbo-CS bears a structure similar to a turbo decoder [10], hence the name Turbo-CS. As illustrated in Fig. 1, the Turbo-CS algorithm consists of two modules. Module A is basically a linear minimum mean square error (LMMSE) estimator of based on the measurement and the messages from Module B. Module B performs minimum mean square error (MMSE) estimation that combines the prior distribution of and the messages from Module A. The two modules are executed iteratively to refine the estimate of . The detailed operations of Turbo-CS are presented in Algorithm 1.
We now give more details of Algorithm 1. Module A estimates based on the measurement in (1) with a priori distributed as . Given with , the posterior distribution of each is still Gaussian with posterior mean and variance given by [20]
[TABLE]
where is the th column of . Note that as the measurement is linear in , the a posteriori mean in (4a) is also called the LMMSE estimator of .
The posterior distributions cannot be used directly in message passing due to the correlation issue. Instead, we need to calculate the so-called extrinsic message [10] for each by excluding the contribution of the input message of . That is, the extrinsic distribution of each satisfies
[TABLE]
where , and “” represents equality up to a constant multiplicative factor. From (5), the extrinsic mean and variance of are respectively given in [21] as
[TABLE]
Combining (4) and (6), we obtain Lines 2 and 3 of Algorithm 1.
It is worth noting that (5) implies the independence of the input distortion and the output distortion . Further more, for Gaussian distributions, independence is equivalent to uncorrelatedness. Thus, we have
[TABLE]
where the expectation is taken over the joint probability distribution of , , and .
We now consider Module B. Recall that Module B estimates each by combining the prior distribution and the message from Module A. Note that the message from Module A is now treated as an input of Module B, denoted by . Following [9], we model each as an observation of corrupted by an additive noise:
[TABLE]
where is independent of . The a posteriori mean and variance of each for Module B are respectively given by
[TABLE]
where denotes conditional variance of given . Similar to (6), the extrinsic variance and mean of for Module B are respectively given by Lines 7 and 8 of Algorithm 1. Also, similar to (7), the extrinsic distortion is uncorrelated with the prior distortion, i.e.
[TABLE]
where the expectation is taken over the joint propability distribution of , and . Later, we will see that (10) plays an important role in the extension of Turbo-CS.
III Denoising-based Turbo CS
III-A Problem Statement
In Algorithm 1, the operation of Module B requires the knowledge of the prior distribution of . However, such prior information is difficult to acquire in many applications. Low-complexity robust denoisers, rather than the optimal MMSE denoiser, are usually employed in practice, even when the prior distribution of is available.
Turbo-CS with a generic denoiser is illustrated in Fig. 2. Compared with Fig. 1, the only difference is that Turbo-CS in Fig. 2 replaces the MMSE denoiser by a generic denoiser, defined as
[TABLE]
where represents the denoising function with being the input, being the output, and and being the parameters. Note that the choice of will be specified when a specific denoiser is involved. For brevity, we may simplify the notation to in circumstances without causing ambiguity. Also, we denote the th entry of as
[TABLE]
With the above replacement, the main challenge is how to calculate the extrinsic message of each for Module B, without the prior knowledge of the distribution . Note that Lines 7 and 8 of Algorithm 1 cannot be used any more since they hold only for the MMSE denoiser.
III-B Extrinsic Messages for a Generic Denoiser
We now describe how to calculate the extrinsic messages for a generic denoiser. Without loss of generality, denote the extrinsic output of Module B by . We call a extrinsic denoiser. Similarly to Line 8 of Algorithm 1, we construct by a linear combination of the a priori mean and the a posteriori mean:
[TABLE]
where and are coefficients to be determined. Clearly, (13) is identical to Line 8 of Algorithm 1 by letting and . Here, we require that and are chosen such that
- (i)
The extrinsic distortion is uncorrelated with the prior distortion, i.e.
[TABLE] 2. (ii)
is minimized.
From the discussions in Section II-B, the calculation of the extrinsic messages in Lines 8 and 9 satisfies the above two conditions when the MMSE denoiser is employed. Note that (14) is a relaxation of (10) since (10) implies (14) but the converse does not necessarily hold. Later we will see that this relaxation is good for many applications. What remains is to determine and satisfying conditions (i) and (ii) for a generic denoiser. This is elaborated in the following.
III-B1 Determining parameter
As mentioned in Section II-B, the input message of Module B can be modeled by (8), where the noise part is independent of . Then
[TABLE]
where (15a) follows from (8), and (15b) follows by noting . To proceed, we introduce the Stein’s lemma [22] as follows: For a normally distributed random variable , and a differentiable function such that , we have
[TABLE]
Then
[TABLE]
where denotes the partial derivative of with respect to variable and the expectation is taken over the joint probability distribution of and . In the above, (17a) follows from (8) and (13), (17c) from and , and (17e) from the Steins’s lemma by letting . Combining (14), (15b), and (17), we obtain
[TABLE]
where denotes divergence, and is the partial derivative of with respect to . Note that the approximation in (18b) becomes accurate when is large. Also, with this approximation, the calculation of does not depend on the distribution of .
By substituting (18) into (13), we obtain
[TABLE]
where the extrinsic denoiser is defined as
[TABLE]
The divergence of at is zero by noting
[TABLE]
Thus, belongs to the family of divergence-free denoisers proposed in [11].
III-B2 Determining parameter
Ideally, we want to choose parameter to satisfy condition (ii) below (14). However, the MSE is difficult to evaluate as the distribution of is unknown. To address this problem, we use the Stein’s unbiased risk estimate (SURE) [22] to approximate the MSE.
To be specific, consider the signal model
[TABLE]
where is the additive Gaussian noise draw from . The mean square error of denoiser is defined by
[TABLE]
The SURE of the MSE of is given by
[TABLE]
Compared with the MSE in (23), the SURE in (24) does not involve the distribution of . We next use SURE as a surrogate for MSE and tune the denoiser by minimizing the SURE. Recall from (8) that can be represented as . Let . Then, applying (24) to , we obtain
[TABLE]
where the second step follows from (21), and the last step from (19). Minimizing the SURE given in (25), we obtain the optimal given by
[TABLE]
III-C Denoising-based Turbo CS
We are now ready to extend Turbo-CS for a generic denoiser. We refer to the extended algorithm as Denoising-based Turbo-CS (D-Turbo-CS). The details of D-Turbo-CS are presented in Algorithm 2.
Compared with Turbo-CS, D-Turbo-CS has the same operations in Module A. But for Module B, D-Turbo-CS employs a generic denoiser, rather than the MMSE denoiser. Correspondingly, the extrinsic mean is calculated using Line 6 of Algorithm 2; the extrinsic variance is calculated in Line 7 by following Eqn. (71) in [16].
IV Construction of Extrinsic Denoisers
Various denoisers have been proposed in the literature for noise suppression. For example, the SURE-LET [13], the BM3D [23], and the dictionary learning [24] are developed for image denoising; the singular value thresholding (SVT) [25] is used for low-rank matrix denoising. In this section, we study the applications of these denoisers in D-Turbo-CS. We describe how to construct the corresponding extrinsic denoiser for any given denoiser . Based on that, we further consider optimizing the denoiser parameter .
IV-A Extrinsic SURE-LET Denoiser
We start with the SURE-LET denoiser. A SURE-LET denoiser is constructed as a linear combination of some kernel functions. The combination coefficients are determined by minimize the SURE of the MSE [13].
Specifically, a SURE-LET denoiser is constructed as
[TABLE]
where is an orthonormal transform matrix, for are kernel functions, , and . can be the Haar wavelet transform matrix or the DCT transfrom matrix.
The choice of kernel functions depends on the structure of the input signals. For example, the authors in [26] proposed the following piecewise linear kernel functions for sparse signals:
[TABLE]
where represents the th element of , is the th element of , and are constants chosen based on the noise level . The recommended values of and can be found in [26].
For SURE-LET denoiser in (27), the corresponding extrinsic denoiser is given by
[TABLE]
where (31a) is from (20), and , for .
We next determine the optimal by minimizing the SURE. From (24), the SURE of is given by
[TABLE]
where (32b) follows from (21) and (31), (32c) follows from and with .
The optimal that minimizes in (32) is given by
[TABLE]
where the th entry of and the th entry of are respectively given by
[TABLE]
with
[TABLE]
IV-B Extrinsic SVT Denoiser
In many applications, data are arranged in a matrix form. Thus, we rearrange the signal vector into a matrix with , and consider the recovery of a low-rank from the noisy observation
[TABLE]
where is the noise level, and contains i.i.d. Gaussian noise with zero mean and unit variance. Let be the rank of . We assume that is a low-rank matrix, i.e. . A popular method for low-rank matrix denoising is the so-called singular value thresholding (SVT) [14]:
[TABLE]
where is a regularization parameter, denotes the Frobenius norm, and denotes the nuclear norm. The singular value decomposition of is given by
[TABLE]
where , satisfies , and satisfies . Then, the SVT denoiser in (37) has the following closed-form expression [14]:
[TABLE]
From [14], the divergence of the SVT denoiser has a closed-form expression given by
[TABLE]
where . For an SVT denoiser, we construct the extrinsic denoiser based on (20) as
[TABLE]
where
[TABLE]
We define the MSE and the SURE of respectively as
[TABLE]
It’s clear that the divergence of is zero. From (43b), the SURE of the MSE is given by
[TABLE]
The optimal that minimizes in (44) is given by
[TABLE]
By substituting into (44), and after some straightforward manipulations, we obtain
[TABLE]
The optimal threshold that minimizes given in (46) can be obtained by solving the following optimization problem:
[TABLE]
The problem in (47) is non-convex. However, since only one parameter is involved, we can solve (47) by exhaustive search.
IV-C Other Extrinsic Denoisers
Both the SURE-LET denoiser and the SVT denoiser have analytical expressions. However, there are other denoisers that can not be expressed in a closed form. The corresponding extrinsic denoisers also have no analytical expressions. We give two examples as follows.
The first example is the dictionary learning denoiser. Dictionary learning aims to find a sparse representation for a given data set in the form of a linear combination of a set of basic elements. This set of basic elements is called a dictionary. Existing dictionary learning algorithms include K-SVD [27], iterative least square (ILS) [28], recursive least squares (RLS) [29], and the sequential generalization of -means (SGK) [30]. Based on above dictionary learning algorithms, we can construct dictionary learning denoisers by following the approach in [24]. Specifically, consider a noisy image matrix , where and are integers. We reshape into a vector , where . Also, we divide the whole image into blocks of size , and reshape each block into a vector , where is an integer satisfying . Note that is related to by where is the corresponding block extraction matrix. Then we use as the training set to train a dictionary using any of the dictionary learning algorithms mentioned above, where is an integer satisfying . The image block can be expressed approximately as
[TABLE]
where is the sparse representation of using the dictionary . Then, we update the whole image vector based on the learned dictionary and coefficients by averaging the denoised image block vectors as
[TABLE]
where is a constant depending on the input noise level. Finally, we reshape the image vector back into an image matrix.
The second example is the BM3D denoiser [23]. The denoising process of BM3D is summarized as follows. First, the image matrix is separated into image blocks of size (with ). For each image block, similar blocks are found and grouped together into a three-dimensional (3D) data array. Then, collaborative filtering is used to denoise the 3D data arrays. The filtered blocks are then returned back to their original positions. Note that BM3D achieves the state-of-the-art visual quality among all the existing image denoisers.
The above dictionary learning and BM3D denoisers have no close-form expressions, and so the divergences of these denoisers can not be calculated explicitly. Instead, we evaluate their divergences using the Monte Carlo method. Specifically, the divergence of can be estimated by
[TABLE]
where is a small constant, is a perturbation matrix with the elements i.i.d. drawn from , and with and be the th elements of and , respectively. The expectation in (50) can be approximated by sample average. It is observed in [15] that one sample is good enough for high-dimensional problems.
V Evolution Analysis of D-Turbo-CS
V-A MSE Evolution
The behavior of D-Turbo-CS can be characterized by the so-called MSE evolution. Denote the input normalized mean square error (NMSE) of Module A (or equivalently, the output NMSE of Module B) at iteration as , and the output NMSE of Module A (or equivalently, the input NMSE of Module B) at iteration as , where NMSE is defined by
[TABLE]
Then, the MSE evolution is characterized by
[TABLE]
where the (52a) follows from Line 3 of Algorithm 2, (52b) follows from the assumption in (8), the expectation in (52b) is taken over , and is initialized as . We next examine the accuracy of the above MSE evolution.
V-B * with i.i.d. Entries*
We consider the situation of with i.i.d. entries. In simulation, each in is Gaussian-Bernoulli distributed with probability density function , where is the Dirac delta function. The other settings are: the sparsity rate , the measurement rate , the signal length , and the sensing matrix is chosen as the random partial DCT defined by
[TABLE]
where is a random row selection matrix which consists of randomly selected rows from a permutation matrix, and is the DCT matrix. In simulation, the SURE-LET denoiser with the kernel functions given in (28)-(30) is employed in D-Turbo-CS and D-AMP, with the corresponding algorithms denoted by LET-Turbo-CS and LET-AMP, respectively.
As shown in Fig. 3, the MSE evolution of LET-Turbo-CS matches well with the simulation. In contrast, for LET-AMP, the state evolution deviates from the simulation. Also, LET-Turbo-CS outperforms LET-AMP111Note that, the performance of LET-AMP here is better than the original LET-AMP [26], because under the condition, LET-AMP diverges, and we replace the estimated variance in D-AMP with a more robust estimate given in [31]. considerably and performs close to MMSE-Turbo-CS in which the MMSE denoiser is employed. We also plot the QQplot of the estimation error of at iteration 10 of LET-Turbo-CS in Fig. 4. From the QQplot, we see that is close to zero-mean Gaussian, which agrees well with the assumption in (8). Later, we will see that the Gaussianity of is a good indicator of the accuracy of the MSE evolution.
V-C * with Correlated Entries*
In many applications, signals are correlated and the prior distribution is unknown. For example, the adjacent pixels of a natural image are correlated and their distributions are not available. We next study the MSE evolution of D-Turbo-CS for compressive image recovery.
In simulation, we generate signal from the image “Fingerprint” of size taken from the Javier Portilla’s dataset [32] by reshaping the image into a vector of size . The denoiser is chosen as the BM3D denoiser, and the corresponding algorithms are denoted as BM3D-Turbo-CS and BM3D-AMP. We set the measurement rate to 0.3.
With the sensing matrix given in (53), the performance of BM3D-Turbo-CS and BM3D-AMP is simulated and shown in Fig. 5. We see that the simulation results of both algorithms do not match with the MSE evolution. Also, we plot the QQplot of the estimation error in Fig. 6. We see that the distribution of is not quite Gaussian, and the mean of the distribution is not zero. This interprets the failure of the evolution prediction.
We conjecture that the reason for the degradation of the simulation performance in Fig. 5 is that the correlation in is not appropriately handled. So, we replace by
[TABLE]
where is a diagonal matrix with the random signs (1 or -1) in the diagonal. The simulation result with sensing matrix is shown in Fig. 7. We see that now, the MSE evolution of BM3D-Turbo-CS matches well with the simulation. Also, BM3D-Turbo-CS outperforms BM3D-AMP in both converge rate and recovery quality. In Fig 8, the QQplot of the estimation error of at iteration 2 of BM3D-Turbo-CS is plotted. We see that the estimation error is close to zero-mean Gaussian, similarly to the case of i.i.d. . To summarize, the sensing matrix in (53) is good for i.i.d. , while the sensing matrix in (54) is needed when the entries of are correlated.
VI Performance Comparisons
In this section, we provide numerical results of D-Turbo-CS for compressive image recovery and low-rank matrix recovery. For comparison, the recovery accuracy is measured by peak signal-to-noise ratio (PSNR):
[TABLE]
where MAX denotes the maximum possible pixel value of the image.
The stopping criterion of D-Turbo-CS is described as follows. The D-Turbo-CS algorithm stops when its output at iteration satisfies or when it is excecuted for over iterations, where and are predetermined constants.
VI-A Noiseless Image Recovery
For noiseless compressive image recovery, we consider three denoisers mentioned in Section IV: the SURE-LET denoiser, the BM3D denoiser and the dictionary learning denoiser. The corresponding algorithms of D-Turbo-CS and D-AMP are denoted by LET-Turbo-CS and LET-AMP, BM3D-Turbo-CS and BM3D-AMP, and SGK-Turbo-CS and SGK-AMP. The EM-GM-AMP algorithm in [16] is also included for comparison. The test images are chosen from the Javier Portilla’s dataset, including “Lena”, “Boat”, “Barbara” and, “Fingerprint” in Fig. 9. The settings of and are as follows: and for SURE-LET; and for BM3D; and for SGK.
In Table I, we compare D-Turbo-CS with D-AMP and EM-GM-AMP for noiseless natural image recovery with the sensing matrix given in (54). We see that D-Turbo-CS outperforms D-AMP and EM-GM-AMP for all the test images under almost all measurement rates and denoisers. To compare the reconstruction speed, we further report the reconstruction time of LET-AMP and LET-Turbo-CS in Table II. Both algorithms are run until the stopping criterion is activated. We see that the reconstruction time of LET-Turbo-CS is much less than that of LET-AMP. In Table III, we list the PSNR of reconstructed images using BM3D-AMP and BM3D-Turbo-CS for sensing matrix and . From the table, we see that the recovery quality for sensing matrix is very poor, which is consistent with the observation in Fig. 5. To summarize, D-Turbo-CS has significant advantages over D-AMP and EM-GM-AMP in compressive image recovery in both visual quality and recovery time.
VI-B Low-Rank Matrix Recovery
For low-rank matrix recovery, we use the SVT denoiser. The corresponding algorithms of D-Turbo-CS and D-AMP are denoted respectively by SVT-Turbo-CS and SVT-AMP. The low-rank matrix is generated by the multiplication of two random matrices of size and , with the elements of the two matrices independently drawn from .
The NMSE comparison of SVT-Turbo-CS and SVT-AMP under the measurement rate with sensing matrix given in (54) is shown in Fig. 10. We see that, SVT-Turbo-CS significantly outperforms SVT-AMP, and the MSE evolution of SVT-Turbo-CS agrees well with the simulation result.
VII Conclutions
In this paper, we developed the D-Turbo-CS algorithm for compressed sensing. We discussed how to construct and optimize the so-called extrinsic denoisers for D-Turbo-CS. D-Turbo-CS does not require prior knowledge of the signal distribution, and so can be adopted in many applications including compressive image recovery and low-rank matrix recovery. Numerical results show that D-Turbo-CS outperforms D-AMP and EM-GM-AMP in terms of both recovery accuracy and convergence speed when partial orthogonal sensing matrices are involved.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] D. L. Donoho, “Compressed sensing,” IEEE Trans. Inf. Theory , vol. 52, no. 4, pp. 1289–1306, Apr. 2006.
- 2[2] R. Tibshirani, “Regression shrinkage and selection via the lasso,” Journal of the Royal Statistical Society. Series B (Methodological) , pp. 267–288, 1996.
- 3[3] S. G. Mallat and Z. Zhang, “Matching pursuits with time-frequency dictionaries,” IEEE Trans. Signal Process. , vol. 41, no. 12, pp. 3397–3415, 1993.
- 4[4] J. A. Tropp, “Greed is good: Algorithmic results for sparse approximation,” IEEE Trans. Inf. Theory , vol. 50, no. 10, pp. 2231–2242, 2004.
- 5[5] R. D. Nowak, S. J. Wright et al. , “Gradient projection for sparse reconstruction: Application to compressed sensing and other inverse problems,” IEEE J. Sel. Topics Signal Process. , vol. 1, no. 4, pp. 586–597, 2007.
- 6[6] D. Needell and J. A. Tropp, “Cosamp: Iterative signal recovery from incomplete and inaccurate samples,” Applied and Computational Harmonic Analysis , vol. 26, no. 3, pp. 301–321, 2009.
- 7[7] D. L. Donoho, A. Maleki, and A. Montanari, “Message-passing algorithms for compressed sensing,” in Proc. Nat. Acad. Sci. , vol. 106, no. 45, Nov. 2009.
- 8[8] M. Bayati and A. Montanari, “The dynamics of message passing on dense graphs, with applications to compressed sensing,” IEEE Trans. Inf. Theory , vol. 57, no. 2, pp. 764–785, Feb. 2011.
