Denoising-based Turbo Compressed Sensing

Zhipeng Xue, Junjie Ma, and Xiaojun Yuan

arXiv:1703.08756·cs.IT·March 28, 2017

TL;DR

This paper introduces Denoising-based Turbo Compressed Sensing (D-Turbo-CS), a flexible algorithm that improves sparse signal recovery by employing generic denoisers, extending Turbo-CS to more complex signal structures without prior distribution knowledge.

Contribution

The paper proposes a novel D-Turbo-CS algorithm that uses generic denoisers, enabling efficient recovery of complex signals like images and low-rank matrices without prior distribution information.

Findings

01

D-Turbo-CS outperforms existing algorithms in reconstruction quality.

02

D-Turbo-CS runs faster than its AMP-based counterparts (Table II compares LET-Turbo-CS with LET-AMP).

03

The algorithm's dynamics are accurately described by a simple MSE evolution recursion.

Tables (3)

Table I: The PSNR of the reconstructed images with sensing matrix $\bm{A}_{2}$.

Image Name	Lena						Boat
Measurement rate	5%	10%	20%	30%	50%	70%	5%	10%	20%	30%	50%	70%
EM-GM-AMP [16]	21.56	23.33	25.22	26.89	29.50	32.38	19.55	21.06	22.87	24.70	27.78	30.95
LET-AMP [26]	-	-	-	22.09	31.38	34.57	-	-	0.23	20.02	29.62	33.28
LET-Turbo-CS	22.27	24.32	26.77	28.57	31.74	35.48	19.96	21.86	24.43	26.51	30.16	34.22
BM3D-AMP [15]	28.44	31.77	33.90	34.36	38.55	39.48	26.15	28.83	31.67	33.54	35.46	39.21
BM3D-Turbo-CS	29.28	31.88	34.35	35.88	38.60	42.64	26.15	29.03	32.32	34.30	37.32	40.92
SGK-AMP [33]	7.67	8.27	27.92	29.85	33.17	35.87	5.35	5.53	25.49	28.07	31.42	34.57
SGK-Turbo-CS	7.70	8.35	29.01	31.30	34.60	37.85	5.39	5.56	26.22	28.85	32.54	35.90
Image Name	Barbara						Fingerprint
Measurement rate	5%	10%	20%	30%	50%	70%	5%	10%	20%	30%	50%	70%
EM-GM-AMP [16]	18.54	20.56	22.65	24.47	27.69	32.14	16.81	18.24	20.37	22.51	26.04	29.58
LET-AMP [26]	-	-	-	19.92	27.57	31.15	-	-	-	17.78	29.05	33.46
LET-Turbo-CS	18.87	20.65	22.86	24.58	28.07	32.36	16.03	18.03	22.09	24.05	29.83	34.90
BM3D-AMP [15]	26.74	29.52	32.81	35.21	38.46	41.66	18.18	22.75	26.61	28.59	32.07	36.44
BM3D-Turbo-CS	26.73	30.40	34.23	36.46	39.91	43.37	18.04	24.53	27.73	30.28	34.51	38.91
SGK-AMP [33]	5.94	6.35	25.58	28.20	32.13	35.78	4.59	4.76	20.32	23.98	28.49	32.39
SGK-Turbo-CS	5.88	6.36	26.43	29.30	33.65	37.77	4.58	4.82	20.46	24.10	28.38	32.79

Table II: The recovery time of different images for sensing matrix $\bm{A}_{2}$. The unit of time is seconds.

Image Name	Lena						Boat
Measurement rate	5%	10%	20%	30%	50%	70%	5%	10%	20%	30%	50%	70%
LET-AMP [26]	3.14	2.43	2.57	2.41	2.46	2.58	2.91	2.35	2.42	2.47	3.65	3.37
LET-Turbo-CS	1.40	1.23	1.23	1.19	0.93	0.85	0.97	1.16	1.21	1.39	1.63	1.27
Image Name	Barbara						Fingerprint
Measurement rate	5%	10%	20%	30%	50%	70%	5%	10%	20%	30%	50%	70%
LET-AMP [26]	3.48	3.55	3.11	3.86	3.58	3.24	4.41	3.61	3.41	3.30	3.10	4.08
LET-Turbo-CS	1.44	1.58	1.37	1.41	1.75	1.46	1.30	2.40	3.08	3.02	1.69	1.46

Table III: The PSNR of the reconstructed images for different sensing matrices.

Sensing matrices	$\bm{A}_{1}$			$\bm{A}_{2}$			$\bm{A}_{1}$			$\bm{A}_{2}$
Image Name	Lena						Boat
Measurement rate	30%	50%	70%	30%	50%	70%	30%	50%	70%	30%	50%	70%
BM3D-AMP [15]	7.68	7.59	13.83	34.36	38.55	39.48	-	9.23	19.11	33.54	35.46	39.21
BM3D-Turbo-CS	7.68	8.69	13.84	35.88	38.60	42.64	-	9.32	19.15	34.30	37.32	40.92
Image Name	Barbara						Fingerprint
Measurement rate	30%	50%	70%	30%	50%	70%	30%	50%	70%	30%	50%	70%
BM3D-AMP [15]	5.88	9.93	21.06	35.21	38.46	41.66	-	7.9	20.05	28.59	32.07	36.44
BM3D-Turbo-CS	5.88	10.18	21.07	36.46	39.91	43.37	-	8.39	20.07	30.28	34.51	38.91

Equations (136)

$\bm{y}=\bm{A}\bm{x}+\bm{n}$

$\hat{\bm{x}}=\arg\min_{\bm{x}\in\mathbb{R}^{n}}\frac{1}{2}\|\bm{y}-\bm{A}\bm{x}\|_{2}^{2}+\lambda\|\bm{x}\|_{1}.$

$\hat{\bm{x}}=\arg\max_{\bm{x}\in\mathbb{R}^{n}}p(\bm{x}\mid\bm{y})$

$x_{A,i}^{post}$

$v_{A}^{post}$

$\mathcal{N}_{x_{i}}(x_{A,i}^{pri},v_{A,i}^{pri})\,\mathcal{N}_{x_{i}}(x_{A,i}^{ext},v_{A,i}^{ext})\doteq\mathcal{N}_{x_{i}}(x_{A,i}^{post},v_{A,i}^{post}),$

$x_{A,i}^{ext}$

$v_{A}^{ext}$

$\mathrm{E}[(x_{i}-x_{A,i}^{pri})(x_{i}-x_{A,i}^{ext})]=0,$

$x_{B,i}^{pri}=x_{i}+n_{B,i}^{pri}$

$x_{B,i}^{post}=\mathrm{E}[x_{i}\mid x_{B,i}^{pri}],\quad v_{B}^{post}=\frac{1}{n}\sum_{i=1}^{n}\mathrm{var}[x_{i}\mid x_{B,i}^{pri}],$

$\mathrm{E}[(x_{i}-x_{B,i}^{pri})(x_{i}-x_{B,i}^{ext})]=0,$

$\bm{x}_{B}^{post}=\bm{D}(\bm{x}_{B}^{pri};v_{B}^{pri},\bm{\theta}),$

$x_{B,i}^{post}=D_{i}(\bm{x}_{B}^{pri}).$

$\bm{x}_{B}^{ext}=\bm{D}^{ext}(\bm{x}_{B}^{pri})=c\,(\bm{x}_{B}^{post}-\alpha\bm{x}_{B}^{pri})$

$\mathrm{E}[(\bm{x}-\bm{x}_{B}^{pri})^{T}(\bm{x}-\bm{x}_{B}^{ext})]=0;$

$\mathrm{E}[(\bm{x}-\bm{x}_{B}^{pri})^{T}(\bm{x}-\bm{x}_{B}^{ext})]=\sum_{i=1}^{n}\mathrm{E}[n_{B,i}^{pri}x_{B,i}^{ext}]$

$\sigma_{y}^{2}\,\mathrm{E}[h^{\prime}(y)]=\mathrm{E}[(y-\mu_{y})h(y)].$

$\mathrm{E}[n_{B,i}^{pri}x_{B,i}^{ext}]=$

$-c\alpha\,\mathrm{E}[n_{B,i}^{pri}n_{B,i}^{pri}]$

$\alpha\approx\frac{1}{n}\sum_{i=1}^{n}D_{i}^{\prime}(\bm{x}_{B}^{pri})=\frac{1}{n}\mathrm{div}\{\bm{D}(\bm{x}_{B}^{pri})\},$

$\bm{x}_{B}^{ext}=c\left(\bm{D}(\bm{x}_{B}^{pri})-\frac{1}{n}\mathrm{div}\{\bm{D}(\bm{x}_{B}^{pri})\}\bm{x}_{B}^{pri}\right)=\bm{D}^{ext}(\bm{x}_{B}^{pri}),$

$\bm{D}^{ext}(\bm{r})=c\left(\bm{D}(\bm{r})-\frac{1}{n}\mathrm{div}\{\bm{D}(\bm{x}_{B}^{pri})\}\bm{r}\right).$

$\mathrm{div}\{\bm{D}^{ext}(\bm{x}_{B}^{pri})\}=c\left(\mathrm{div}\{\bm{D}(\bm{x}_{B}^{pri})\}-\mathrm{div}\{\bm{D}(\bm{x}_{B}^{pri})\}\right)=0.$

$\bm{r}=\bm{x}+\tau\bm{n},$

$\mathrm{MSE}=\frac{1}{n}\mathrm{E}[\|\bm{D}(\bm{r})-\bm{x}\|^{2}].$

$\widehat{\mathrm{MSE}}=\frac{1}{n}\|\bm{D}(\bm{r})-\bm{r}\|^{2}+\frac{2\tau^{2}}{n}\mathrm{div}\{\bm{D}(\bm{r})\}-\tau^{2}.$

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Image and Signal Denoising Methods · Photoacoustic and Ultrasonic Imaging

Full text

Denoising-based Turbo Compressed Sensing

Zhipeng Xue, Junjie Ma, and Xiaojun Yuan

Z. Xue and X. Yuan are with the School of Information Science and Technology, ShanghaiTech University. J. Ma is with the Department of Statistics, Columbia University. The work in this paper was partially presented at GlobalSIP in Dec. 2016; see reference [12].

Abstract

Turbo compressed sensing (Turbo-CS) is an efficient iterative algorithm for sparse signal recovery with partial orthogonal sensing matrices. In this paper, we extend the Turbo-CS algorithm to solve compressed sensing problems involving more general signal structure, including compressive image recovery and low-rank matrix recovery. A main difficulty for such an extension is that the original Turbo-CS algorithm requires prior knowledge of the signal distribution that is usually unavailable in practice. To overcome this difficulty, we propose to redesign the Turbo-CS algorithm by employing a generic denoiser that does not depend on the prior distribution, hence the name denoising-based Turbo-CS (D-Turbo-CS). We then derive the extrinsic information for a generic denoiser by following the Turbo-CS principle. Based on that, we optimize the parametric extrinsic denoisers to minimize the output mean-square error (MSE). Explicit expressions are derived for the extrinsic SURE-LET denoiser used in compressive image denoising and also for the singular value thresholding (SVT) denoiser used in low-rank matrix denoising. We find that the dynamics of D-Turbo-CS can be well described by a scalar recursion called MSE evolution, similar to the case for Turbo-CS. Numerical results demonstrate that D-Turbo-CS considerably outperforms the counterpart algorithms in both reconstruction quality and running time.

Index Terms:

Compressed sensing, message passing, orthogonal sensing matrix, denoising, MSE evolution.

I Introduction

Compressed sensing (CS) [1] is a new paradigm for sparse signal reconstruction. A common approach to the compressed sensing problem is to solve a mixed $l_{1}$-norm and $l_{2}$-norm minimization problem via convex programming [2]. However, a convex program in general involves polynomial-time complexity, which causes a serious scalability problem for mass data applications.

Approximate algorithms have been extensively studied to reduce the computational complexity of sparse signal recovery. Existing approaches include matching pursuit [3], orthogonal matching pursuit [4], iterative soft thresholding [5], compressive sampling matching pursuit [6], and approximate message passing (AMP) [7]. In particular, AMP is a fast-converging iterative algorithm based on the principle of message passing. It has been shown that, when the sensing matrix is independent and identically distributed (i.i.d.) Gaussian, AMP is asymptotically optimal as the dimension of the state space goes to infinity [8]. Also, the iterative process of AMP can be tracked through a scalar recursion called state evolution.

In many applications, compressive measurements are taken from a transformed domain, such as the discrete Fourier transform (DFT), the discrete cosine transform (DCT), and the wavelet transform. On the one hand, this exempts us from storing the sensing matrix in implementation; on the other hand, these orthogonal transforms can be realized using fast algorithms to reduce the computational complexity. However, the AMP algorithm, when applied to orthogonal sensing, does not perform well, and its simulated performance deviates from the prediction of the state evolution.

Turbo compressed sensing (Turbo-CS) [9] solved the above discrepancy by a careful redesign of the message passing algorithm. The Turbo-CS algorithm consists of two processing modules: One module handles the linear measurements of the sparse signal based on the linear minimum mean-square error (LMMSE) principle and calculates the so-called extrinsic information to decorrelate the input and output estimation errors; the other module combines its input with the signal sparsity by following the minimum mean-square error (MMSE) principle and also calculates the extrinsic information. The two modules are executed iteratively to refine the estimates. This is similar to the decoding process of a turbo code [10], hence the name Turbo-CS. It has been shown that Turbo-CS considerably outperforms its counterparts for compressed sensing in both complexity and convergence speed.

In this paper, we extend the Turbo-CS algorithm to solve compressed sensing problems with partial orthogonal sensing matrices involving more general signal structures, such as compressive image recovery and low-rank matrix recovery. An immediate obstacle for such an extension is that the MMSE module in the Turbo-CS algorithm in [9] requires the prior knowledge of the signal distribution, while the latter is generally unavailable in the new problems under concern. To overcome this obstacle, we replace the MMSE module in Turbo-CS by a generic denoiser that does not depend on the prior distribution. We derive the extrinsic information for a generic denoiser by following the Turbo-CS principle. Interestingly, we show that the resulting extrinsic denoiser falls into the category of divergence-free denoisers in [11]. Based on that, we propose to optimize the parametric extrinsic denoisers to minimize the output mean-square error (MSE). Explicit expressions are derived for the extrinsic SURE-LET denoiser used in image denoising [13] and also for the singular value thresholding (SVT) denoiser used in low-rank matrix denoising [14].

We find that the dynamics of denoising-based Turbo-CS (D-Turbo-CS) can be characterized by a scalar recursion called MSE evolution. We also study the impact of the choice of the sensing matrix on the accuracy of the MSE evolution in Turbo-CS. We show that when the signals to be recovered are i.i.d., the output error of the LMMSE module can be modelled as an additive Gaussian noise and the corresponding state evolution is accurate. However, the state evolution is not necessarily accurate when correlated signals are involved, e.g., in the case of image denoising where the neighbouring pixels of an image are usually continuous in value and so are correlated with each other. We show that this problem can be solved by an appropriate design of the sensing matrix. A simple solution is to right-multiply the sensing matrix by an extra diagonal matrix with random +1 or -1 entries on the diagonal. This extra diagonal matrix randomly flips the signs of the signals, and effectively decorrelates the signals.
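The sign-flipping construction described above can be sketched as follows. This is a minimal illustration under our own assumptions: the function names `dct_matrix` and `partial_orthogonal_matrix` are hypothetical, the DCT is built as an explicit matrix (a fast transform would be used in practice), and the row selection is uniformly random.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix built explicitly (fine for small n)."""
    k = np.arange(n)[:, None]
    j = np.arange(n)[None, :]
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * j + 1) * k / (2 * n))
    C[0, :] /= np.sqrt(2.0)
    return C

def partial_orthogonal_matrix(n, m, rng, random_signs=True):
    """Keep m random rows of the DCT; optionally right-multiply by diag(+/-1)."""
    rows = np.sort(rng.choice(n, size=m, replace=False))
    A = dct_matrix(n)[rows, :]
    if random_signs:
        signs = rng.choice([-1.0, 1.0], size=n)
        A = A * signs[None, :]          # same as A @ np.diag(signs)
    return A

rng = np.random.default_rng(0)
A = partial_orthogonal_matrix(n=64, m=32, rng=rng)
row_gram = A @ A.T                      # stays the identity after sign flipping
```

Flipping column signs leaves the rows orthonormal, so the partial orthogonal structure that Turbo-CS relies on is preserved.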

We further compare the performance of D-Turbo-CS with the state-of-the-art algorithms in the literature. For example, denoising-based AMP (D-AMP) was studied in [15], and a number of popular image denoisers were examined therein. Also, the EM-GM-AMP algorithm proposed in [16] can be applied to the compressed image denoising problem under concern. Numerical results demonstrate that D-Turbo-CS considerably outperforms D-AMP and EM-GM-AMP in both convergence rate and recovery accuracy.

The remainder of the paper proceeds as follows. Section II gives a brief review of the Turbo-CS algorithm in [9]. Section III describes how to extend the Turbo-CS algorithm for a generic denoiser. The construction of an extrinsic denoiser is discussed in Section IV. Section V studies the MSE evolution for D-Turbo-CS. Numerical comparisons of D-Turbo-CS with its counterparts are presented in Section VI. Section VII concludes the paper.

II Preliminaries

II-A Compressed Sensing

Consider the following real-valued linear system:

$$\bm{y}=\bm{A}\bm{x}+\bm{n} \tag{1}$$

where $\bm{x}\in\mathbb{R}^{n}$ is an unknown signal vector, $\bm{A}\in\mathbb{R}^{m\times n}$ is a known constant matrix, and $\bm{n}$ is a white Gaussian noise vector with zero mean and covariance $\sigma^{2}\bm{I}$ . Here, $\bm{I}$ represents the identity matrix of an appropriate size. Our goal is to recover $\bm{x}$ from the measurement $\bm{y}$ . In particular, this problem is known as compressed sensing when $m<n$ and $\bm{x}$ is sparse.

Basis Pursuit De-Noising (BPDN) is a well-known approach to the recovery of $\bm{x}$ in compressed sensing, with the problem formulated as

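$$\hat{\bm{x}}=\arg\min_{\bm{x}\in\mathbb{R}^{n}}\frac{1}{2}\|\bm{y}-\bm{A}\bm{x}\|_{2}^{2}+\lambda\|\bm{x}\|_{1}, \tag{2}$$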

where $\|\bm{x}\|_{p}=(\sum_{n}|x_{n}|^{p})^{1/p}$ represents the $l_{p}$-norm, $x_{n}$ is the $n$th entry of $\bm{x}$, and $\lambda$ is a regularization parameter. This problem can be solved by convex programming algorithms, such as the interior point method [17] and the proximal method [18]. Interior point methods have cubic computational complexity, which is too expensive for high-dimensional applications such as imaging. Proximal methods have low per-iteration complexity; however, their convergence is typically slow.
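To make the proximal approach concrete, the sketch below runs plain proximal-gradient (ISTA) iterations on the BPDN objective in (2). It is an illustration only: the problem sizes, the value of `lam`, and the i.i.d. Gaussian test matrix are our assumptions, not settings from the paper.

```python
import numpy as np

def soft_threshold(v, t):
    """Entrywise soft thresholding: the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def bpdn_objective(A, y, x, lam):
    return 0.5 * np.sum((y - A @ x) ** 2) + lam * np.sum(np.abs(x))

def ista(A, y, lam, num_iters=200):
    """Proximal-gradient iterations for 0.5*||y - Ax||_2^2 + lam*||x||_1."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2      # 1 / Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(num_iters):
        grad = A.T @ (A @ x - y)
        x = soft_threshold(x - step * grad, step * lam)
    return x

rng = np.random.default_rng(0)
m, n, k = 80, 200, 8
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
y = A @ x_true + 0.01 * rng.standard_normal(m)

lam = 0.02
x_hat = ista(A, y, lam)
obj_start = bpdn_objective(A, y, np.zeros(n), lam)
obj_end = bpdn_objective(A, y, x_hat, lam)
```

Each iteration costs two matrix-vector products, which is the low per-iteration complexity mentioned above; the number of iterations needed is what makes the method slow.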

Message passing is a promising alternative to solve the BPDN problem in (2). To apply message passing, we first notice that (2) can be viewed as a maximum a posteriori probability (MAP) estimation problem. Specifically, we assign a prior distribution $p(\bm{x})\propto\exp(\frac{-\lambda\|\bm{x}\|_{1}}{\sigma^{2}})$ to $\bm{x}$ . Then, it is easy to verify that $\hat{\bm{x}}$ in (2) is equivalent to:

$$\hat{\bm{x}}=\arg\max_{\bm{x}\in\mathbb{R}^{n}}p(\bm{x}|\bm{y}) \tag{3}$$

where $p(\bm{x}|\bm{y})=p(\bm{y}|\bm{x})p(\bm{x})/p(\bm{y})$. In [19], a factor graph was established to represent the above probability model, based on which approximate message passing (AMP) was used to iteratively solve the inference problem in (3). As the established factor graph is dense in general, directly applying message passing to the graph leads to high complexity. To reduce complexity, two approximations are introduced in AMP: First, messages from factor nodes to variable nodes are nearly Gaussian; second, messages from variable nodes to factor nodes can be calculated by using a Taylor-series approximation to reduce computational cost. It was shown in [19] that the approximation error vanishes when $m,n\rightarrow\infty$ with a fixed ratio.

The convergence of the AMP algorithm requires that the elements of the sensing matrix $\bm{A}$ are sufficiently random. It was shown in [8] that AMP is asymptotically optimal when $\bm{A}$ is i.i.d. Gaussian and the behavior of AMP can be characterized by a scalar recursion called state evolution.
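For reference, a commonly used form of the AMP iteration with a soft-thresholding denoiser can be sketched as below. The threshold rule (a multiple `alpha` of the empirical residual level), the problem sizes, and the i.i.d. Gaussian matrix are our assumptions for illustration; they are not taken from [7], [8], or [19].

```python
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def amp(A, y, alpha=1.5, num_iters=30):
    """AMP with a soft-thresholding denoiser and an Onsager correction term."""
    m, n = A.shape
    x = np.zeros(n)
    z = y.copy()
    for _ in range(num_iters):
        tau = np.linalg.norm(z) / np.sqrt(m)        # empirical effective noise level
        x = soft_threshold(x + A.T @ z, alpha * tau)
        # Onsager term: (n/m) * (average derivative of the denoiser) * previous residual.
        onsager = z * (np.count_nonzero(x) / m)
        z = y - A @ x + onsager
    return x

rng = np.random.default_rng(0)
n, m, k = 1000, 500, 50
A = rng.standard_normal((m, n)) / np.sqrt(m)        # i.i.d. Gaussian sensing matrix
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
y = A @ x_true + 0.01 * rng.standard_normal(m)

x_hat = amp(A, y)
nmse = np.sum((x_hat - x_true) ** 2) / np.sum(x_true ** 2)
```

The Onsager term is what separates AMP from plain iterative soft thresholding; it keeps the effective noise at the denoiser input approximately Gaussian when $\bm{A}$ is i.i.d.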

II-B Turbo Compressed Sensing

In many applications, the sensing matrix $\bm{A}$ is neither i.i.d. nor Gaussian. For example, to reduce storage and computational complexity, measurements are usually taken from an orthogonal transform domain, such as DFT or DCT. In these scenarios, the performance of AMP deteriorates and the convergence of AMP is not guaranteed. This motivates the development of the Turbo-CS algorithm [9] described below.

The block diagram of the Turbo-CS algorithm is illustrated in Fig. 1. Turbo-CS bears a structure similar to a turbo decoder [10], hence the name Turbo-CS. As illustrated in Fig. 1, the Turbo-CS algorithm consists of two modules. Module A is basically a linear minimum mean square error (LMMSE) estimator of $\bm{x}$ based on the measurement $\bm{y}$ and the messages from Module B. Module B performs minimum mean square error (MMSE) estimation that combines the prior distribution of $\bm{x}$ and the messages from Module A. The two modules are executed iteratively to refine the estimate of $\bm{x}$ . The detailed operations of Turbo-CS are presented in Algorithm 1.

We now give more details of Algorithm 1. Module A estimates $\bm{x}$ based on the measurement $\bm{y}$ in (1) with $\bm{x}$ a priori distributed as $\bm{x}\sim\mathcal{N}(\bm{x}_{A}^{pri},v_{A}^{pri}\bm{I})$ . Given $\bm{y}$ with $\bm{x}\sim\mathcal{N}(\bm{x}_{A}^{pri},v_{A}^{pri}\bm{I})$ , the posterior distribution of each $x_{i}$ is still Gaussian with posterior mean and variance given by [20]

[TABLE]

where $\bm{a}_{i}$ is the $i$ th column of $\bm{A}$ . Note that as the measurement $\bm{y}$ is linear in $\bm{x}$ , the a posteriori mean in (4a) is also called the LMMSE estimator of $x_{i}$ .

The posterior distributions cannot be used directly in message passing due to the correlation issue. Instead, we need to calculate the so-called extrinsic message [10] for each $x_{i}$ by excluding the contribution of the input message of $x_{i}$ . That is, the extrinsic distribution of each $x_{i}$ satisfies

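$$\mathcal{N}_{x_{i}}(x_{A,i}^{pri},v_{A,i}^{pri})\,\mathcal{N}_{x_{i}}(x_{A,i}^{ext},v_{A,i}^{ext})\doteq\mathcal{N}_{x_{i}}(x_{A,i}^{post},v_{A,i}^{post}), \tag{5}$$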

where $\mathcal{N}_{x}(m,v)=\frac{1}{\sqrt{2\pi v}}\exp(-\frac{1}{2v}(x-m)^{2})$ , and “ $\doteq$ ” represents equality up to a constant multiplicative factor. From (5), the extrinsic mean and variance of $x_{i}$ are respectively given in [21] as

[TABLE]

Combining (4) and (6), we obtain Lines 2 and 3 of Algorithm 1.
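The Module A update can be illustrated with the sketch below. It assumes the partial orthogonal case $\bm{A}\bm{A}^{T}=\bm{I}$ and uses the standard Gaussian LMMSE and message-division formulas, so the expressions may differ in form from (4) and (6); the function name `module_a` and the test setup are ours.

```python
import numpy as np

def module_a(A, y, x_pri, v_pri, sigma2):
    """LMMSE estimate and extrinsic message, assuming A @ A.T == I."""
    m, n = A.shape
    residual = A.T @ (y - A @ x_pri)
    # Posterior (LMMSE) mean and average posterior variance.
    x_post = x_pri + v_pri / (v_pri + sigma2) * residual
    v_post = v_pri - (m / n) * v_pri**2 / (v_pri + sigma2)
    # Extrinsic message: divide the posterior Gaussian by the prior Gaussian, as in (5).
    v_ext = 1.0 / (1.0 / v_post - 1.0 / v_pri)
    x_ext = v_ext * (x_post / v_post - x_pri / v_pri)
    return x_post, v_post, x_ext, v_ext

rng = np.random.default_rng(1)
n, m, sigma2 = 64, 32, 0.01
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))    # random orthogonal matrix
A = Q[:m, :]                                         # m orthonormal rows
x = rng.standard_normal(n)
y = A @ x + np.sqrt(sigma2) * rng.standard_normal(m)

x_pri, v_pri = np.zeros(n), 1.0
x_post, v_post, x_ext, v_ext = module_a(A, y, x_pri, v_pri, sigma2)
```

Under these assumptions the two steps collapse to $\bm{x}_{A}^{ext}=\bm{x}_{A}^{pri}+\frac{n}{m}\bm{A}^{T}(\bm{y}-\bm{A}\bm{x}_{A}^{pri})$ and $v_{A}^{ext}=\frac{n}{m}(v_{A}^{pri}+\sigma^{2})-v_{A}^{pri}$, which is a convenient check on an implementation.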

It is worth noting that (5) implies the independence of the input distortion $x_{A,i}^{pri}-x_{i}$ and the output distortion $x_{A,i}^{ext}-x_{i}$. Furthermore, for Gaussian distributions, independence is equivalent to uncorrelatedness. Thus, we have

$$\mathrm{E}[(x_{i}-x_{A,i}^{pri})(x_{i}-x_{A,i}^{ext})]=0, \tag{7}$$

where the expectation is taken over the joint probability distribution of $x_{i}$ , $x_{A,i}^{pri}$ , and $x_{A,i}^{ext}$ .

We now consider Module B. Recall that Module B estimates each $x_{i}$ by combining the prior distribution $x_{i}\sim p(x_{i})$ and the message from Module A. Note that the message $x_{A,i}^{ext}$ from Module A is now treated as an input of Module B, denoted by $x_{B,i}^{pri}$ . Following [9], we model each $x_{B,i}^{pri}$ as an observation of $x_{i}$ corrupted by an additive noise:

$$x_{B,i}^{pri}=x_{i}+n_{B,i}^{pri} \tag{8}$$

where $n_{B,i}^{pri}\sim\mathcal{N}(0,v_{B}^{pri})$ is independent of $x_{i}$ . The a posteriori mean and variance of each $x_{i}$ for Module B are respectively given by

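$$x_{B,i}^{post}=\mathrm{E}[x_{i}|x_{B,i}^{pri}], \tag{9a}$$

$$v_{B}^{post}=\frac{1}{n}\sum_{i=1}^{n}\mathrm{var}[x_{i}|x_{B,i}^{pri}], \tag{9b}$$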

where $\text{var}[x|y]$ denotes conditional variance of $x$ given $y$ . Similar to (6), the extrinsic variance and mean of $\bm{x}$ for Module B are respectively given by Lines 7 and 8 of Algorithm 1. Also, similar to (7), the extrinsic distortion is uncorrelated with the prior distortion, i.e.

$$\mathrm{E}[(x_{i}-x_{B,i}^{pri})(x_{i}-x_{B,i}^{ext})]=0, \tag{10}$$

where the expectation is taken over the joint probability distribution of $x_{i}$, $x_{B,i}^{pri}$, and $x_{B,i}^{ext}$. Later, we will see that (10) plays an important role in the extension of Turbo-CS.

III Denoising-based Turbo CS

III-A Problem Statement

In Algorithm 1, the operation of Module B requires the knowledge of the prior distribution of $\bm{x}$ . However, such prior information is difficult to acquire in many applications. Low-complexity robust denoisers, rather than the optimal MMSE denoiser, are usually employed in practice, even when the prior distribution of $\bm{x}$ is available.

Turbo-CS with a generic denoiser is illustrated in Fig. 2. Compared with Fig. 1, the only difference is that Turbo-CS in Fig. 2 replaces the MMSE denoiser by a generic denoiser, defined as

$$\bm{x}_{B}^{post}=\bm{D}(\bm{x}_{B}^{pri};v_{B}^{pri},\bm{\theta}), \tag{11}$$

where $\bm{D}(\cdot)$ represents the denoising function with $\bm{x}_{B}^{pri}$ being the input, $\bm{x}_{B}^{post}$ being the output, and $v_{B}^{pri}$ and $\bm{\theta}$ being the parameters. Note that the choice of $\bm{\theta}$ will be specified when a specific denoiser is involved. For brevity, we may simplify the notation $\bm{D}(\bm{x}_{B}^{pri};v_{B}^{pri},\bm{\theta})$ to $\bm{D}(\bm{x}_{B}^{pri})$ in circumstances without causing ambiguity. Also, we denote the $i$ th entry of $\bm{x}_{B}^{post}$ as

$$x_{B,i}^{post}=D_{i}(\bm{x}_{B}^{pri}). \tag{12}$$

With the above replacement, the main challenge is how to calculate the extrinsic message of each $x_{i}$ for Module B, without the prior knowledge of the distribution $p(x_{i})$ . Note that Lines 7 and 8 of Algorithm 1 cannot be used any more since they hold only for the MMSE denoiser.

III-B Extrinsic Messages for a Generic Denoiser

We now describe how to calculate the extrinsic messages for a generic denoiser. Without loss of generality, denote the extrinsic output of Module B by $\bm{x}_{B}^{ext}=\bm{D}^{ext}(\bm{x}_{B}^{pri})$. We call $\bm{D}^{ext}(\bm{x}_{B}^{pri})$ an extrinsic denoiser. Similarly to Line 8 of Algorithm 1, we construct $\bm{x}_{B}^{ext}$ by a linear combination of the a priori mean and the a posteriori mean:

$$\bm{x}_{B}^{ext}=\bm{D}^{ext}(\bm{x}_{B}^{pri})=c\,(\bm{x}_{B}^{post}-\alpha\bm{x}_{B}^{pri}) \tag{13}$$

where $c$ and $\alpha$ are coefficients to be determined. Clearly, (13) is identical to Line 8 of Algorithm 1 by letting $c=\frac{v_{B}^{ext}}{v_{B}^{post}}$ and $\alpha=\frac{v_{B}^{post}}{v_{B}^{pri}}$ . Here, we require that $c$ and $\alpha$ are chosen such that

(i) The extrinsic distortion is uncorrelated with the prior distortion, i.e.

$$\mathrm{E}[(\bm{x}-\bm{x}_{B}^{pri})^{T}(\bm{x}-\bm{x}_{B}^{ext})]=0; \tag{14}$$

(ii) $\mathrm{E}[\|\bm{x}_{B}^{ext}-\bm{x}\|^{2}]$ is minimized.

From the discussions in Section II-B, the calculation of the extrinsic messages in Lines 7 and 8 of Algorithm 1 satisfies the above two conditions when the MMSE denoiser is employed. Note that (14) is a relaxation of (10) since (10) implies (14) but the converse does not necessarily hold. Later we will see that this relaxation works well in many applications. What remains is to determine $c$ and $\alpha$ satisfying conditions (i) and (ii) for a generic denoiser. This is elaborated in the following.

III-B1 Determining parameter $\alpha$

As mentioned in Section II-B, the input message of Module B can be modeled by (8), where the noise part $n_{B,i}^{pri}$ is independent of $x_{i}$ . Then

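$$\begin{aligned}\mathrm{E}[(\bm{x}-\bm{x}_{B}^{pri})^{T}(\bm{x}-\bm{x}_{B}^{ext})]&=\mathrm{E}[(\bm{n}_{B}^{pri})^{T}(\bm{x}_{B}^{ext}-\bm{x})]&&\text{(15a)}\\&=\sum_{i=1}^{n}\mathrm{E}[n_{B,i}^{pri}x_{B,i}^{ext}]&&\text{(15b)}\end{aligned}$$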

where (15a) follows from (8), and (15b) follows by noting $\mathrm{E}[(\bm{n}_{B}^{pri})^{T}\bm{x}]=0$. To proceed, we introduce Stein's lemma [22] as follows: For a normally distributed random variable $y\sim\mathcal{N}(\mu_{y},\sigma_{y}^{2})$, and a differentiable function $h:\mathbb{R}\rightarrow\mathbb{R}$ such that $\mathrm{E}[|h^{\prime}(y)|]<\infty$, we have

$$\sigma_{y}^{2}\,\mathrm{E}[h^{\prime}(y)]=\mathrm{E}[(y-\mu_{y})h(y)]. \tag{16}$$

Then

[TABLE]

where $D_{i}^{\prime}(\bm{x}+\bm{n}_{B}^{pri})$ denotes the partial derivative of $D_{i}(\bm{x}+\bm{n}_{B}^{pri})$ with respect to the variable $n_{B,i}^{pri}$, and the expectation is taken over the joint probability distribution of $\bm{n}_{B}^{pri}$ and $\bm{x}$. In the above, (17a) follows from (8) and (13), (17c) from $\mathrm{E}[n_{B,i}^{pri}x_{i}]=0$ and $\mathrm{E}[n_{B,i}^{pri}n_{B,i}^{pri}]=v_{B}^{pri}$, and (17e) from Stein's lemma by letting $y=n_{B,i}^{pri}$. Combining (14), (15b), and (17), we obtain

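$$\begin{aligned}\alpha&=\frac{1}{n}\sum_{i=1}^{n}\mathrm{E}\left[D_{i}^{\prime}(\bm{x}_{B}^{pri})\right]&&\text{(18a)}\\&\approx\frac{1}{n}\sum_{i=1}^{n}D_{i}^{\prime}(\bm{x}_{B}^{pri})&&\text{(18b)}\\&=\frac{1}{n}\mathrm{div}\{\bm{D}(\bm{x}_{B}^{pri})\},&&\text{(18c)}\end{aligned}$$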

where $\mathrm{div}$ denotes divergence, and $D_{i}^{\prime}(\bm{x}_{B}^{pri})$ is the partial derivative of $D_{i}(\bm{x}_{B}^{pri})$ with respect to $x_{B,i}^{pri}$ . Note that the approximation in (18b) becomes accurate when $n$ is large. Also, with this approximation, the calculation of $\alpha$ does not depend on the distribution of $\bm{x}$ .
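The role of $\alpha$ can be checked numerically. The sketch below is an illustration under our own assumptions: a soft-thresholding denoiser stands in for the generic $\bm{D}$, the test signal is sparse with 10% nonzero entries, and $c=1$. It compares the empirical correlation in condition (i) for the posterior and for the extrinsic outputs.

```python
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

rng = np.random.default_rng(2)
n, v_pri = 200_000, 0.25
x = rng.standard_normal(n) * (rng.random(n) < 0.1)      # sparse test signal
noise = np.sqrt(v_pri) * rng.standard_normal(n)
x_pri = x + noise                                        # input model (8)

thresh = 1.5 * np.sqrt(v_pri)
x_post = soft_threshold(x_pri, thresh)
# Soft thresholding has derivative 1 where |r_i| > thresh and 0 elsewhere,
# so its divergence is the number of entries that survive the threshold.
alpha = np.count_nonzero(np.abs(x_pri) > thresh) / n

x_ext = x_post - alpha * x_pri                           # (13) with c = 1
corr_post = np.mean((x - x_pri) * (x - x_post))          # clearly nonzero
corr_ext = np.mean((x - x_pri) * (x - x_ext))            # close to zero
```

The posterior error stays correlated with the input error, while subtracting $\alpha\bm{x}_{B}^{pri}$ removes that correlation up to sampling fluctuations, without using the distribution of $\bm{x}$.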

By substituting (18) into (13), we obtain

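$$\bm{x}_{B}^{ext}=c\left(\bm{D}(\bm{x}_{B}^{pri})-\frac{1}{n}\mathrm{div}\{\bm{D}(\bm{x}_{B}^{pri})\}\bm{x}_{B}^{pri}\right)=\bm{D}^{ext}(\bm{x}_{B}^{pri}), \tag{19}$$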

where the extrinsic denoiser $\bm{D}^{ext}(\cdot)$ is defined as

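$$\bm{D}^{ext}(\bm{r})=c\left(\bm{D}(\bm{r})-\frac{1}{n}\mathrm{div}\{\bm{D}(\bm{x}_{B}^{pri})\}\bm{r}\right). \tag{20}$$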

The divergence of $\bm{D}^{ext}(\bm{r})$ at $\bm{r}=\bm{x}_{B}^{pri}$ is zero by noting

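$$\mathrm{div}\{\bm{D}^{ext}(\bm{x}_{B}^{pri})\}=c\left(\mathrm{div}\{\bm{D}(\bm{x}_{B}^{pri})\}-\mathrm{div}\{\bm{D}(\bm{x}_{B}^{pri})\}\right)=0. \tag{21}$$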

Thus, $\bm{D}^{ext}(\cdot)$ belongs to the family of divergence-free denoisers proposed in [11].

III-B2 Determining parameter $c$

Ideally, we want to choose parameter $c$ to satisfy condition (ii) below (14). However, the MSE is difficult to evaluate as the distribution of $\bm{x}$ is unknown. To address this problem, we use Stein's unbiased risk estimate (SURE) [22] to approximate the MSE.

To be specific, consider the signal model

$$\bm{r}=\bm{x}+\tau\bm{n}, \tag{22}$$

where $\bm{n}\in\mathbb{R}^{n\times 1}$ is the additive Gaussian noise drawn from $\mathcal{N}(0,\bm{I})$. The mean square error of denoiser $\bm{D}(\bm{r})$ is defined by

$$\mathrm{MSE}=\frac{1}{n}\mathrm{E}[\|\bm{D}(\bm{r})-\bm{x}\|^{2}]. \tag{23}$$

The SURE of the MSE of $\bm{D}(\bm{r})$ is given by

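$$\widehat{\mathrm{MSE}}=\frac{1}{n}\|\bm{D}(\bm{r})-\bm{r}\|^{2}+\frac{2\tau^{2}}{n}\mathrm{div}\{\bm{D}(\bm{r})\}-\tau^{2}. \tag{24}$$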

Compared with the MSE in (23), the SURE in (24) does not involve the distribution of $\bm{x}$ . We next use SURE as a surrogate for MSE and tune the denoiser by minimizing the SURE. Recall from (8) that $\bm{x}_{B}^{pri}$ can be represented as $\bm{x}_{B}^{pri}=\bm{x}+\bm{n}_{B}^{pri}$ . Let $\tau=\sqrt{v_{B}^{pri}}$ . Then, applying (24) to $\bm{D}^{ext}(\bm{x}_{B}^{pri})$ , we obtain

[TABLE]

where the second step follows from (21), and the last step from (19). Minimizing the SURE given in (25), we obtain the optimal $c$ given by

[TABLE]
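As a numerical illustration of this step, the sketch below evaluates the SURE in (24) for a soft-thresholding denoiser and then picks $c$ for the corresponding extrinsic denoiser. The closed form used for `c_opt` is our own least-squares reading of the minimization (the divergence-free SURE is a quadratic in $c$) and may differ in form from the expression in the paper; the signal model and threshold are assumptions.

```python
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

rng = np.random.default_rng(3)
n, tau = 100_000, 0.5
x = rng.standard_normal(n) * (rng.random(n) < 0.1)
r = x + tau * rng.standard_normal(n)                 # signal model (22)

thresh = 1.5 * tau
d = soft_threshold(r, thresh)
div_d = np.count_nonzero(np.abs(r) > thresh)

# SURE (24) for the plain denoiser versus its true MSE.
sure_d = np.sum((d - r) ** 2) / n + 2 * tau**2 * div_d / n - tau**2
mse_d = np.mean((d - x) ** 2)

# Extrinsic denoiser: divergence-free, so SURE reduces to ||c*u - r||^2 / n - tau^2.
u = d - (div_d / n) * r
c_opt = (u @ r) / (u @ u)                            # least-squares minimizer over c

def sure_ext_of(c):
    return np.sum((c * u - r) ** 2) / n - tau**2

sure_ext = sure_ext_of(c_opt)
mse_ext = np.mean((c_opt * u - x) ** 2)
```

Neither `sure_d` nor `sure_ext` uses `x`, yet both track the true MSE closely at this problem size, which is what allows $c$ to be tuned without knowing the prior.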

III-C Denoising-based Turbo CS

We are now ready to extend Turbo-CS for a generic denoiser. We refer to the extended algorithm as Denoising-based Turbo-CS (D-Turbo-CS). The details of D-Turbo-CS are presented in Algorithm 2.

Compared with Turbo-CS, D-Turbo-CS has the same operations in Module A. But for Module B, D-Turbo-CS employs a generic denoiser, rather than the MMSE denoiser. Correspondingly, the extrinsic mean is calculated using Line 6 of Algorithm 2; the extrinsic variance is calculated in Line 7 by following Eqn. (71) in [16].
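The overall iteration can be sketched as below. This is a hedged illustration rather than a transcription of Algorithm 2: it assumes $\bm{A}\bm{A}^{T}=\bm{I}$, uses soft thresholding as the generic denoiser, initializes $v_{A}^{pri}$ from the measurement energy, and estimates the extrinsic variance from the measurement residual, which is our assumption about that step.

```python
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def d_turbo_cs(A, y, sigma2, num_iters=30, k=1.5):
    """Sketch of D-Turbo-CS with a soft-thresholding denoiser, assuming A @ A.T == I."""
    m, n = A.shape
    x_a_pri = np.zeros(n)
    v_a_pri = max(y @ y / m - sigma2, 1e-12)          # rough estimate of the signal power
    for _ in range(num_iters):
        # Module A: LMMSE estimate followed by the extrinsic update (closed form).
        x_b_pri = x_a_pri + (n / m) * (A.T @ (y - A @ x_a_pri))
        v_b_pri = (n / m) * (v_a_pri + sigma2) - v_a_pri
        # Module B: generic denoiser and its normalized divergence, as in (18).
        thresh = k * np.sqrt(v_b_pri)
        x_b_post = soft_threshold(x_b_pri, thresh)
        alpha = np.count_nonzero(np.abs(x_b_pri) > thresh) / n
        # Extrinsic denoiser (19) with the SURE-minimizing scale c.
        u = x_b_post - alpha * x_b_pri
        c = (u @ x_b_pri) / max(u @ u, 1e-30)
        x_b_ext = c * u
        # Residual-based estimate of the extrinsic variance (assumption of this sketch).
        v_b_ext = max(np.sum((y - A @ x_b_ext) ** 2) / m - sigma2, 1e-12)
        x_a_pri, v_a_pri = x_b_ext, v_b_ext
    return x_b_post

rng = np.random.default_rng(0)
n, m, sigma2 = 512, 256, 1e-4
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))     # random orthogonal matrix
A = Q[:m, :]                                          # partial orthogonal sensing matrix
x_true = rng.standard_normal(n) * (rng.random(n) < 0.1)
y = A @ x_true + np.sqrt(sigma2) * rng.standard_normal(m)

x_hat = d_turbo_cs(A, y, sigma2)
nmse = np.sum((x_hat - x_true) ** 2) / np.sum(x_true ** 2)
```

Only the denoiser, its divergence, and the scale `c` change when another denoiser is plugged in; Module A is untouched.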

IV Construction of Extrinsic Denoisers

Various denoisers have been proposed in the literature for noise suppression. For example, SURE-LET [13], BM3D [23], and dictionary learning [24] were developed for image denoising, and singular value thresholding (SVT) [25] is used for low-rank matrix denoising. In this section, we study the applications of these denoisers in D-Turbo-CS. We describe how to construct the corresponding extrinsic denoiser $\bm{D}^{ext}(\bm{r};\bm{\theta})$ for any given denoiser $\bm{D}(\bm{r};\bm{\theta})$. Based on that, we further consider optimizing the denoiser parameter $\bm{\theta}$.

IV-A Extrinsic SURE-LET Denoiser

We start with the SURE-LET denoiser. A SURE-LET denoiser is constructed as a linear combination of some kernel functions. The combination coefficients are determined by minimizing the SURE of the MSE [13].

Specifically, a SURE-LET denoiser is constructed as

[TABLE]

where $\bm{O}\in\mathbb{R}^{n\times n}$ is an orthonormal transform matrix, $\bm{\psi}_{k}:\mathbb{R}^{n}\rightarrow\mathbb{R}^{n}$ for $k=1,\cdots,K$ are kernel functions, $\bm{\theta}=[\theta_{1},\theta_{2},\cdots,\theta_{K}]^{T}$, and $\bm{\Psi}_{k}(\bm{r})=\bm{O}\bm{\psi}_{k}(\bm{O}^{T}\bm{r})$. $\bm{O}$ can be the Haar wavelet transform matrix or the DCT transform matrix.

The choice of kernel functions $\{\bm{\psi}_{k}\}$ depends on the structure of the input signals. For example, the authors in [26] proposed the following piecewise linear kernel functions for sparse signals:

[TABLE]

where $\psi_{k,i}(\bm{r})$ represents the $i$th element of $\bm{\psi}_{k}(\bm{r})$, $r_{i}$ is the $i$th element of $\bm{r}$, and $\beta_{1}$ and $\beta_{2}$ are constants chosen based on the noise level $\tau^{2}$. The recommended values of $\beta_{1}$ and $\beta_{2}$ can be found in [26].

For SURE-LET denoiser $\bm{D}(\bm{r};\bm{\theta})$ in (27), the corresponding extrinsic denoiser $\bm{D}^{ext}(\bm{r};\bm{\theta})$ is given by

[TABLE]

where (31a) is from (20), and $\theta_{k}^{\prime}=c\theta_{k}$ , for $k=1,\cdots,K$ .

We next determine the optimal $\bm{\theta}^{\prime}=[\theta_{1}^{\prime},\cdots,\theta_{K}^{\prime}]^{T}$ by minimizing the SURE. From (24), the SURE of $\bm{D}^{ext}(\bm{r},\bm{\theta})$ is given by

[TABLE]

where (32b) follows from (21) and (31), (32c) follows from $\bm{\Psi}_{k}(\bm{O}\tilde{\bm{r}})=\bm{O}\bm{\psi}_{k}(\tilde{\bm{r}})$ and $\mathrm{div}\{\bm{\Psi}_{k}(\bm{r})\}=\mathrm{div}\{\bm{\psi}_{k}(\tilde{\bm{r}})\}$ with $\tilde{\bm{r}}=\bm{O}^{T}\bm{r}$ .

The optimal $\bm{\theta}^{\prime}$ that minimizes $\widehat{\mathrm{MSE}}$ in (32) is given by

[TABLE]

where the $(i,j)$ th entry of $\bm{M}\in\mathbb{R}^{K\times K}$ and the $i$ th entry of $\bm{b}\in\mathbb{R}^{K\times 1}$ are respectively given by

[TABLE]

with

[TABLE]
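The construction can be sketched numerically as follows. Several assumptions are ours: the transform $\bm{O}$ is taken as the identity, the three kernels are hypothetical stand-ins (not the piecewise linear kernels of [26]), and $\bm{M}$ and $\bm{b}$ are formed as the normal equations of the divergence-free SURE, which may differ in scaling from the expressions in the paper.

```python
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def kernels_and_divergences(r, tau):
    """Hypothetical kernel set; returns kernel outputs (n x K) and their divergences."""
    k1 = soft_threshold(r, tau)
    k2 = soft_threshold(r, 3.0 * tau)
    g = np.exp(-r**2 / (12.0 * tau**2))
    k3 = r * g
    div1 = np.count_nonzero(np.abs(r) > tau)
    div2 = np.count_nonzero(np.abs(r) > 3.0 * tau)
    div3 = np.sum(g * (1.0 - r**2 / (6.0 * tau**2)))   # derivative of r * g(r)
    return np.stack([k1, k2, k3], axis=1), np.array([div1, div2, div3], dtype=float)

def extrinsic_sure_let(r, tau):
    """Divergence-free kernels combined with SURE-optimal weights theta'."""
    n = r.size
    Psi, div = kernels_and_divergences(r, tau)
    Psi_ext = Psi - np.outer(r, div / n)       # column k is Psi_k(r) - (div_k / n) * r
    M = Psi_ext.T @ Psi_ext
    b = Psi_ext.T @ r
    theta = np.linalg.lstsq(M, b, rcond=None)[0]
    return Psi_ext @ theta, theta

rng = np.random.default_rng(4)
n, tau = 100_000, 0.3
x = rng.standard_normal(n) * (rng.random(n) < 0.1)
r = x + tau * rng.standard_normal(n)

x_ext, theta = extrinsic_sure_let(r, tau)
mse_ext = np.mean((x_ext - x) ** 2)
sure_ext = np.sum((x_ext - r) ** 2) / n - tau**2
```

Because every column of `Psi_ext` is divergence-free, the SURE is a plain quadratic in the weights, so fitting them is a small $K\times K$ linear solve.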

IV-B Extrinsic SVT Denoiser

In many applications, data are arranged in a matrix form. Thus, we rearrange the signal vector $\bm{x}\in\mathbb{R}^{n\times 1}$ into a matrix $\bm{X}\in\mathbb{R}^{n_{1}\times n_{2}}$ with $n_{1}n_{2}=n$ , and consider the recovery of a low-rank $\bm{X}$ from the noisy observation

[TABLE]

where $\tau$ is the noise level, and $\bm{N}$ contains i.i.d. Gaussian noise with zero mean and unit variance. Let $r$ be the rank of $\bm{X}$ . We assume that $\bm{X}$ is a low-rank matrix, i.e. $r\ll n_{1},n_{2}$ . A popular method for low-rank matrix denoising is the so-called singular value thresholding (SVT) [14]:

[TABLE]

where $\theta>0$ is a regularization parameter, $\|\cdot\|_{F}$ denotes the Frobenius norm, and $\|\cdot\|_{\ast}$ denotes the nuclear norm. The singular value decomposition of $\bm{R}$ is given by

$$\bm{R}=\bm{U}\bm{\Sigma}\bm{V}^{T}, \tag{38}$$

where $\bm{\Sigma}=\text{diag}\{\sigma_{1},\sigma_{2},\cdots,\sigma_{r}\}\in\mathbb{R}^{r\times r}$ , $\bm{U}=[\bm{u}_{1},\bm{u}_{2},\cdots,\bm{u}_{r}]\in\mathbb{R}^{n_{1}\times r}$ satisfies $\bm{U}^{T}\bm{U}=\bm{I}$ , and $\bm{V}=[\bm{v}_{1},\bm{v}_{2},\cdots,\bm{v}_{r}]\in\mathbb{R}^{n_{2}\times r}$ satisfies $\bm{V}^{T}\bm{V}=\bm{I}$ . Then, the SVT denoiser in (37) has the following closed-form expression [14]:

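$$\text{SVT}(\bm{R};\theta)=\sum_{i=1}^{r}(\sigma_{i}-\theta)_{+}\bm{u}_{i}\bm{v}_{i}^{T}, \qquad (a)_{+}=\max(a,0). \tag{39}$$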
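The closed-form SVT step of [14] amounts to soft-thresholding the singular values of $\bm{R}$. The following NumPy sketch illustrates this on a small synthetic example; the function name `svt`, the matrix sizes, the noise level, and the threshold are illustrative assumptions rather than settings taken from the paper.

```python
import numpy as np

def svt(R, theta):
    """Singular value thresholding: shrink each singular value of R by theta."""
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    s_shrunk = np.maximum(s - theta, 0.0)   # (sigma_i - theta)_+
    return (U * s_shrunk) @ Vt              # sum_i (sigma_i - theta)_+ u_i v_i^T

# Illustrative data (assumed sizes): a rank-3 matrix observed in Gaussian noise.
rng = np.random.default_rng(0)
X = rng.standard_normal((30, 3)) @ rng.standard_normal((3, 20))
R = X + 0.1 * rng.standard_normal(X.shape)

X_hat = svt(R, theta=2.0)
```

With a threshold above the largest noise-induced singular value, the output keeps only the dominant singular directions, so `X_hat` has low rank again.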

From [14], the divergence of the SVT denoiser $\text{SVT}(\bm{R};\theta)$ has a closed-form expression given by

[TABLE]

where $n_{m}=\min(n_{1},n_{2})$ . For an SVT denoiser, we construct the extrinsic denoiser $\bm{D}^{ext}(\bm{R};\theta)$ based on (20) as

[TABLE]

where

[TABLE]

We define the MSE and the SURE of $\bm{D}^{ext}(\bm{R};\theta)$ respectively as

[TABLE]

It is clear that the divergence of $\bm{D}^{ext}(\bm{R};\theta)$ is zero. From (43b), the SURE of the MSE is given by

[TABLE]

The optimal $c$ that minimizes $\widehat{\mathrm{MSE}}$ in (44) is given by

[TABLE]

By substituting $c^{opt}$ into (44), and after some straightforward manipulations, we obtain

[TABLE]

The optimal threshold $\theta$ that minimizes $\widehat{\mathrm{MSE}}$ given in (46) can be obtained by solving the following optimization problem:

[TABLE]

The problem in (47) is non-convex. However, since only one parameter $\theta\in[0,\max(\bm{\sigma})]$ is involved, we can solve (47) by exhaustive search.

IV-C Other Extrinsic Denoisers

Both the SURE-LET denoiser and the SVT denoiser have analytical expressions. However, there are other denoisers that cannot be expressed in closed form. The corresponding extrinsic denoisers also have no analytical expressions. We give two examples as follows.

The first example is the dictionary learning denoiser. Dictionary learning aims to find a sparse representation for a given data set in the form of a linear combination of a set of basic elements. This set of basic elements is called a dictionary. Existing dictionary learning algorithms include K-SVD [27], iterative least squares (ILS) [28], recursive least squares (RLS) [29], and the sequential generalization of $K$-means (SGK) [30]. Based on the above dictionary learning algorithms, we can construct dictionary learning denoisers by following the approach in [24]. Specifically, consider a noisy image matrix $\bm{R}\in\mathbb{R}^{n_{1}\times n_{2}}$, where $n_{1}$ and $n_{2}$ are integers. We reshape $\bm{R}$ into a vector $\bm{r}\in\mathbb{R}^{n\times 1}$, where $n=n_{1}n_{2}$. Also, we divide the whole image into blocks of size $n_{3}\times n_{3}$, and reshape each block $\bm{R}_{i,j}$ into a vector $\bm{r}_{i,j}$, where $n_{3}$ is an integer satisfying $n_{3}\ll n_{1},n_{2}$. Note that $\bm{r}_{i,j}$ is related to $\bm{r}$ by $\bm{r}_{i,j}=\bm{E}_{i,j}\bm{r}$, where $\bm{E}_{i,j}\in\mathbb{R}^{n_{3}^{2}\times n}$ is the corresponding block extraction matrix. Then we use $\{\bm{r}_{i,j}\}$ as the training set to train a dictionary $\bm{Q}\in\mathbb{R}^{n_{3}^{2}\times n_{4}}$ using any of the dictionary learning algorithms mentioned above, where $n_{4}$ is an integer satisfying $n_{4}>n_{3}^{2}$. The image block $\bm{r}_{i,j}$ can be expressed approximately as

[TABLE]

where $\bm{\alpha}_{i,j}\in\mathbb{R}^{n_{4}\times 1}$ is the sparse representation of $\bm{r}_{i,j}$ using the dictionary $\bm{Q}$ . Then, we update the whole image vector $\bm{r}$ based on the learned dictionary $\bm{Q}$ and coefficients $\bm{\alpha}_{i,j}$ by averaging the denoised image block vectors as

[TABLE]

where $\lambda$ is a constant depending on the input noise level. Finally, we reshape the image vector $\tilde{\bm{r}}$ back into an image matrix.

The second example is the BM3D denoiser [23]. The denoising process of BM3D is summarized as follows. First, the image matrix $\bm{R}$ is separated into image blocks of size $s_{1}\times s_{1}$ (with $7\leq s_{1}\leq 13$). For each image block, similar blocks are found and grouped together into a three-dimensional (3D) data array. Then, collaborative filtering is used to denoise the 3D data arrays. The filtered blocks are then returned to their original positions. Note that BM3D achieves state-of-the-art visual quality among existing image denoisers.

The above dictionary learning and BM3D denoisers have no closed-form expressions, and so the divergences of these denoisers cannot be calculated explicitly. Instead, we evaluate their divergences using the Monte Carlo method. Specifically, the divergence of $\bm{D}(\bm{R})$ can be estimated by

[TABLE]

where $\delta$ is a small constant, $\tilde{\bm{N}}\in\mathbb{R}^{n_{1}\times n_{2}}$ is a perturbation matrix with elements drawn i.i.d. from $\mathcal{N}(0,1)$, and $\left<\bm{A},\bm{B}\right>=\sum_{i,j}A_{i,j}B_{i,j}$ with $A_{i,j}$ and $B_{i,j}$ being the $(i,j)$th elements of $\bm{A}$ and $\bm{B}$, respectively. The expectation in (50) can be approximated by a sample average. It is observed in [15] that one sample is good enough for high-dimensional problems.
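The Monte Carlo estimate described above can be sketched in a few lines of NumPy. The sketch assumes the usual finite-difference form of the estimator, $\left<\tilde{\bm{N}},(\bm{D}(\bm{R}+\delta\tilde{\bm{N}})-\bm{D}(\bm{R}))/\delta\right>$ averaged over perturbations; the function name `mc_divergence`, the default `delta`, and the soft-thresholding denoiser used in the demonstration are hypothetical choices for illustration, not part of the paper.

```python
import numpy as np

def mc_divergence(denoiser, R, delta=1e-3, num_samples=1, rng=None):
    """Monte Carlo estimate of div{D(R)} for a black-box denoiser D."""
    rng = np.random.default_rng() if rng is None else rng
    base = denoiser(R)
    total = 0.0
    for _ in range(num_samples):
        N = rng.standard_normal(R.shape)              # perturbation, i.i.d. N(0, 1)
        diff = denoiser(R + delta * N) - base         # response to the perturbation
        total += np.sum(N * diff) / delta             # <N, diff> / delta
    return total / num_samples                        # sample average

# Demonstration with a stand-in denoiser (entrywise soft thresholding).
def soft_threshold(Z, theta=1.0):
    return np.sign(Z) * np.maximum(np.abs(Z) - theta, 0.0)

rng = np.random.default_rng(0)
R_demo = rng.standard_normal((64, 64))
div_est = mc_divergence(soft_threshold, R_demo, rng=rng)
```

Because the denoiser is only called as a black box, the same routine can be applied to dictionary learning or BM3D denoisers, at the cost of one extra denoiser call per perturbation sample.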

V Evolution Analysis of D-Turbo-CS

V-A MSE Evolution

The behavior of D-Turbo-CS can be characterized by the so-called MSE evolution. Denote the input normalized mean square error (NMSE) of Module A (or equivalently, the output NMSE of Module B) at iteration $t$ as $v(t)$ , and the output NMSE of Module A (or equivalently, the input NMSE of Module B) at iteration $t$ as $\tau^{2}(t)$ , where NMSE is defined by

[TABLE]

Then, the MSE evolution is characterized by

[TABLE]

where (52a) follows from Line 3 of Algorithm 2, (52b) follows from the assumption in (8), the expectation in (52b) is taken over $\bm{e}\sim\mathcal{N}(\bm{0},\bm{I})$, and $v(0)$ is initialized as $\mathrm{E}[\|\bm{x}\|_{2}^{2}]/n$. We next examine the accuracy of the above MSE evolution.

V-B $\bm{x}$ with i.i.d. Entries

We consider the situation of $\bm{x}$ with i.i.d. entries. In the simulation, each $x_{i}$ in $\bm{x}$ is Gaussian-Bernoulli distributed with probability density function $p(x_{i})=(1-\rho)\delta(x_{i})+\rho\mathcal{N}(x_{i};0,1/\rho)$, where $\delta(\cdot)$ is the Dirac delta function. The other settings are: the sparsity rate $\rho=0.27$, the measurement rate $m/n=0.5$, the signal length $n=20000$, and the sensing matrix is chosen as the random partial DCT defined by

$$\bm{A}_{1}=\bm{S}\bm{W}, \tag{53}$$

where $\bm{S}\in\mathbb{R}^{m\times n}$ is a random row selection matrix which consists of randomly selected rows from a permutation matrix, and $\bm{W}\in\mathbb{R}^{n\times n}$ is the DCT matrix. In simulation, the SURE-LET denoiser with the kernel functions given in (28)-(30) is employed in D-Turbo-CS and D-AMP, with the corresponding algorithms denoted by LET-Turbo-CS and LET-AMP, respectively.
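The Gaussian-Bernoulli test signal used in this simulation can be generated as follows. This is a minimal NumPy sketch under the stated settings ($\rho=0.27$, $n=20000$); the function name `bernoulli_gaussian` and the random seed are illustrative assumptions.

```python
import numpy as np

def bernoulli_gaussian(n, rho, rng):
    """Draw n i.i.d. entries: zero w.p. 1 - rho, N(0, 1/rho) w.p. rho."""
    support = rng.random(n) < rho                          # nonzero positions
    values = rng.normal(0.0, np.sqrt(1.0 / rho), size=n)   # variance 1/rho
    return np.where(support, values, 0.0)

rng = np.random.default_rng(0)
x = bernoulli_gaussian(20000, 0.27, rng)

sparsity = np.mean(x != 0)   # empirical fraction of nonzeros, close to rho
power = np.mean(x ** 2)      # empirical per-entry power, close to 1
```

The variance $1/\rho$ of the nonzero entries normalizes the per-entry power to one, so the initialization $v(0)=\mathrm{E}[\|\bm{x}\|_{2}^{2}]/n$ is close to 1 for this signal model.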

As shown in Fig. 3, the MSE evolution of LET-Turbo-CS matches well with the simulation. In contrast, for LET-AMP, the state evolution deviates from the simulation. Also, LET-Turbo-CS outperforms LET-AMP considerably and performs close to MMSE-Turbo-CS, in which the MMSE denoiser is employed. (Note that the performance of LET-AMP here is better than that of the original LET-AMP [26]: under this condition the original LET-AMP diverges, so we replace the estimated variance $\hat{\sigma}^{2}$ in D-AMP with a more robust estimate $\hat{\sigma}^{2}=\sqrt{\frac{1}{\ln 2}}\mathrm{median}(|\hat{x}|)$ given in [31].) We also plot the QQplot of the estimation error of $\bm{x}_{B}^{pri}$ at iteration 10 of LET-Turbo-CS in Fig. 4. From the QQplot, we see that $\bm{x}_{B}^{pri}-\bm{x}$ is close to zero-mean Gaussian, which agrees well with the assumption in (8). Later, we will see that the Gaussianity of $\bm{x}_{B}^{pri}-\bm{x}$ is a good indicator of the accuracy of the MSE evolution.

V-C $\bm{x}$ with Correlated Entries

In many applications, signals are correlated and the prior distribution is unknown. For example, the adjacent pixels of a natural image are correlated and their distributions are not available. We next study the MSE evolution of D-Turbo-CS for compressive image recovery.

In the simulation, we generate the signal $\bm{x}$ from the image “Fingerprint” of size $512\times 512$, taken from Javier Portilla’s dataset [32], by reshaping the image into a vector of size $262144\times 1$. The denoiser is chosen as the BM3D denoiser, and the corresponding algorithms are denoted as BM3D-Turbo-CS and BM3D-AMP. We set the measurement rate $m/n$ to 0.3.

With the sensing matrix given in (53), the performance of BM3D-Turbo-CS and BM3D-AMP is simulated and shown in Fig. 5. We see that the simulation results of both algorithms do not match the MSE evolution. Also, we plot the QQplot of the estimation error $\bm{x}_{B}^{pri}-\bm{x}$ in Fig. 6. We see that the distribution of $\bm{x}_{B}^{pri}-\bm{x}$ is not quite Gaussian, and the mean of the distribution is not zero. This explains the failure of the MSE evolution prediction.

We conjecture that the reason for the degradation of the simulation performance in Fig. 5 is that the correlation in $\bm{x}$ is not appropriately handled. So, we replace $\bm{A}_{1}$ by

$$\bm{A}_{2}=\bm{S}\bm{W}\bm{\Theta}, \tag{54}$$

where $\bm{\Theta}$ is a diagonal matrix with random signs (1 or -1) on the diagonal. The simulation result with sensing matrix $\bm{A}_{2}$ is shown in Fig. 7. We see that now, the MSE evolution of BM3D-Turbo-CS matches well with the simulation. Also, BM3D-Turbo-CS outperforms BM3D-AMP in both convergence rate and recovery quality. In Fig. 8, the QQplot of the estimation error of $\bm{x}_{B}^{pri}$ at iteration 2 of BM3D-Turbo-CS is plotted. We see that the estimation error is close to zero-mean Gaussian, similarly to the case of i.i.d. $\bm{x}$. To summarize, the sensing matrix in (53) is good for i.i.d. $\bm{x}$, while the sensing matrix in (54) is needed when the entries of $\bm{x}$ are correlated.
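As a concrete illustration of the two sensing matrices discussed above, the following NumPy sketch builds a dense random partial DCT matrix and its sign-randomized variant. It assumes $\bm{A}_{1}=\bm{S}\bm{W}$ and $\bm{A}_{2}=\bm{S}\bm{W}\bm{\Theta}$, which is how the surrounding text describes them. The helper names `dct_matrix` and `partial_dct` and the small dimensions are hypothetical, and a practical implementation would use a fast transform instead of a dense matrix.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix W of size n x n."""
    k = np.arange(n)[:, None]      # frequency index (rows)
    j = np.arange(n)[None, :]      # sample index (columns)
    W = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * j + 1) * k / (2 * n))
    W[0, :] /= np.sqrt(2.0)        # rescale the constant row so that W W^T = I
    return W

def partial_dct(n, m, rng, random_signs=False):
    """Return S W (or S W Theta when random_signs is True) as a dense m x n array."""
    rows = rng.choice(n, size=m, replace=False)   # row selection, i.e. the action of S
    A = dct_matrix(n)[rows, :]
    if random_signs:
        signs = rng.choice([-1.0, 1.0], size=n)   # diagonal of Theta
        A = A * signs                             # column scaling equals right-multiplying by Theta
    return A

rng = np.random.default_rng(0)
A1 = partial_dct(64, 32, rng)
A2 = partial_dct(64, 32, rng, random_signs=True)
```

Both matrices have orthonormal rows, so the random sign flips keep the partial orthogonal structure that Turbo-CS relies on while scrambling the correlation pattern of the signal seen by the DCT.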

VI Performance Comparisons

In this section, we provide numerical results of D-Turbo-CS for compressive image recovery and low-rank matrix recovery. For comparison, the recovery accuracy is measured by peak signal-to-noise ratio (PSNR):

$$\text{PSNR}=10\log_{10}\left(\frac{\text{MAX}^{2}}{\text{MSE}}\right), \tag{55}$$

where MAX denotes the maximum possible pixel value of the image.
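For reference, PSNR can be computed as below. This sketch assumes the standard definition $10\log_{10}(\text{MAX}^{2}/\text{MSE})$ with MSE taken as the per-pixel mean squared error, and a default of MAX = 255 for 8-bit images; the function name `psnr` and the default value are assumptions for illustration.

```python
import numpy as np

def psnr(x_hat, x, max_val=255.0):
    """Peak signal-to-noise ratio in dB between a reconstruction and a reference."""
    x_hat = np.asarray(x_hat, dtype=float)
    x = np.asarray(x, dtype=float)
    mse = np.mean((x_hat - x) ** 2)        # per-pixel mean squared error
    if mse == 0.0:
        return float("inf")                # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)
```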

The stopping criterion of D-Turbo-CS is described as follows. The D-Turbo-CS algorithm stops when its output $\hat{\bm{x}}(t)$ at iteration $t$ satisfies $\frac{\|\hat{\bm{x}}(t)-\hat{\bm{x}}(t-1)\|^{2}}{\|\hat{\bm{x}}(t-1)\|^{2}}\leq\epsilon$ or when it has been executed for over $T$ iterations, where $\epsilon$ and $T$ are predetermined constants.
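The stopping rule above can be written as a small helper. In this sketch the function name `should_stop` is hypothetical, `t` is taken to count completed iterations (one reading of "over $T$ iterations"), and the guard against a zero denominator is an addition not discussed in the text.

```python
import numpy as np

def should_stop(x_new, x_old, t, eps=1e-4, T=20):
    """Return True if the relative change is at most eps or t has reached T."""
    denom = np.sum(x_old ** 2)
    if denom > 0:
        rel_change = np.sum((x_new - x_old) ** 2) / denom
    else:
        rel_change = np.inf            # avoid dividing by zero on an all-zero iterate
    return bool(rel_change <= eps or t >= T)
```

For example, with the settings reported for the SURE-LET denoiser one would call `should_stop(x_new, x_old, t, eps=1e-4, T=20)` at the end of each iteration.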

VI-A Noiseless Image Recovery

For noiseless compressive image recovery, we consider three denoisers mentioned in Section IV: the SURE-LET denoiser, the BM3D denoiser and the dictionary learning denoiser. The corresponding algorithms of D-Turbo-CS and D-AMP are denoted by LET-Turbo-CS and LET-AMP, BM3D-Turbo-CS and BM3D-AMP, and SGK-Turbo-CS and SGK-AMP. The EM-GM-AMP algorithm in [16] is also included for comparison. The test images are chosen from Javier Portilla’s dataset, including “Lena”, “Boat”, “Barbara”, and “Fingerprint” in Fig. 9. The settings of $\epsilon$ and $T$ are as follows: $\epsilon=10^{-4}$ and $T=20$ for SURE-LET; $\epsilon=10^{-4}$ and $T=30$ for BM3D; $\epsilon=10^{-4}$ and $T=20$ for SGK.

In Table I, we compare D-Turbo-CS with D-AMP and EM-GM-AMP for noiseless natural image recovery with the sensing matrix given in (54). We see that D-Turbo-CS outperforms D-AMP and EM-GM-AMP for all the test images under almost all measurement rates and denoisers. To compare the reconstruction speed, we further report the reconstruction time of LET-AMP and LET-Turbo-CS in Table II. Both algorithms are run until the stopping criterion is activated. We see that the reconstruction time of LET-Turbo-CS is much less than that of LET-AMP. In Table III, we list the PSNR of reconstructed images using BM3D-AMP and BM3D-Turbo-CS for sensing matrices $\bm{A}_{1}$ and $\bm{A}_{2}$. From the table, we see that the recovery quality for sensing matrix $\bm{A}_{1}$ is very poor, which is consistent with the observation in Fig. 5. To summarize, D-Turbo-CS has significant advantages over D-AMP and EM-GM-AMP in compressive image recovery in both visual quality and recovery time.

VI-B Low-Rank Matrix Recovery

For low-rank matrix recovery, we use the SVT denoiser. The corresponding algorithms of D-Turbo-CS and D-AMP are denoted respectively by SVT-Turbo-CS and SVT-AMP. The low-rank matrix $\bm{X}$ is generated by the multiplication of two random matrices of size $128\times 10$ and $10\times 128$ , with the elements of the two matrices independently drawn from $\mathcal{N}(0,1)$ .

The NMSE comparison of SVT-Turbo-CS and SVT-AMP under the measurement rate $\delta=m/n=0.48$ with the sensing matrix given in (54) is shown in Fig. 10. We see that SVT-Turbo-CS significantly outperforms SVT-AMP, and the MSE evolution of SVT-Turbo-CS agrees well with the simulation result.

VII Conclusions
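The low-rank test setup of this subsection can be reproduced along the following lines. The sketch generates the rank-10 matrix as the product of two Gaussian factors and derives the number of measurements from the rate 0.48. The NMSE helper assumes the common definition $\|\hat{\bm{x}}-\bm{x}\|_{2}^{2}/\|\bm{x}\|_{2}^{2}$, and the function name `nmse`, the seed, and the rounding of $m$ are assumptions.

```python
import numpy as np

def nmse(x_hat, x):
    """Normalized squared error ||x_hat - x||^2 / ||x||^2 (assumed definition)."""
    return np.sum((x_hat - x) ** 2) / np.sum(x ** 2)

rng = np.random.default_rng(0)
n1, n2, rank = 128, 128, 10

# Product of a 128 x 10 and a 10 x 128 matrix with i.i.d. N(0, 1) entries.
X = rng.standard_normal((n1, rank)) @ rng.standard_normal((rank, n2))

x = X.reshape(-1)            # vectorized signal of length n = n1 * n2
n = x.size
m = int(round(0.48 * n))     # number of measurements at rate m/n = 0.48
```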

In this paper, we developed the D-Turbo-CS algorithm for compressed sensing. We discussed how to construct and optimize the so-called extrinsic denoisers for D-Turbo-CS. D-Turbo-CS does not require prior knowledge of the signal distribution, and so can be adopted in many applications including compressive image recovery and low-rank matrix recovery. Numerical results show that D-Turbo-CS outperforms D-AMP and EM-GM-AMP in terms of both recovery accuracy and convergence speed when partial orthogonal sensing matrices are involved.

Bibliography


[1] D. L. Donoho, “Compressed sensing,” IEEE Trans. Inf. Theory, vol. 52, no. 4, pp. 1289–1306, Apr. 2006.
[2] R. Tibshirani, “Regression shrinkage and selection via the lasso,” Journal of the Royal Statistical Society. Series B (Methodological), pp. 267–288, 1996.
[3] S. G. Mallat and Z. Zhang, “Matching pursuits with time-frequency dictionaries,” IEEE Trans. Signal Process., vol. 41, no. 12, pp. 3397–3415, 1993.
[4] J. A. Tropp, “Greed is good: Algorithmic results for sparse approximation,” IEEE Trans. Inf. Theory, vol. 50, no. 10, pp. 2231–2242, 2004.
[5] R. D. Nowak, S. J. Wright et al., “Gradient projection for sparse reconstruction: Application to compressed sensing and other inverse problems,” IEEE J. Sel. Topics Signal Process., vol. 1, no. 4, pp. 586–597, 2007.
[6] D. Needell and J. A. Tropp, “CoSaMP: Iterative signal recovery from incomplete and inaccurate samples,” Applied and Computational Harmonic Analysis, vol. 26, no. 3, pp. 301–321, 2009.
[7] D. L. Donoho, A. Maleki, and A. Montanari, “Message-passing algorithms for compressed sensing,” in Proc. Nat. Acad. Sci., vol. 106, no. 45, Nov. 2009.
[8] M. Bayati and A. Montanari, “The dynamics of message passing on dense graphs, with applications to compressed sensing,” IEEE Trans. Inf. Theory, vol. 57, no. 2, pp. 764–785, Feb. 2011.
