Coherence-Based Performance Guarantee of Regularized $\ell_{1}$-Norm Minimization and Beyond

Wendong Wang; Feng Zhang; Zhi Wang; and Jianjun Wang

arXiv:1812.03739·cs.NA·December 11, 2018


TL;DR

This paper establishes coherence-based guarantees for regularized $\ell_{1}$-norm minimization in signal recovery, extending existing conditions to noisy, block-sparse, and structured signals, with new uniform recovery conditions and error bounds.

Contribution

It extends the sharp uniform recovery condition based on coherence to unconstrained models for robust signal recovery under various noise models, including structured block-sparse signals.

Findings

01 Established coherence-based performance guarantees for regularized $\ell_{1}$-norm minimization with noise.

02 Extended recovery conditions to Dantzig Selector type noise and block-sparse signals.

03 Provided new error estimates and uniform recovery conditions for structured signals.


Equations

$\mu<\frac{1}{2k-1},$

$\min_{\bm{x}\in\mathbb{R}^{n}}\|\bm{x}\|_{0},\quad\text{s.t.}\quad\bm{b}=A\bm{x}.$

$\min_{\bm{x}\in\mathbb{R}^{n}}\|\bm{x}\|_{1},\quad\text{s.t.}\quad\bm{b}=A\bm{x},$

$(1-\delta)\|\bm{x}\|_{2}^{2}\leq\|A\bm{x}\|_{2}^{2}\leq(1+\delta)\|\bm{x}\|_{2}^{2}$

$\bm{b}=A\bm{x}+\bm{z},$

$\min_{\bm{x}\in\mathbb{R}^{n}}\|\bm{x}\|_{1},\quad\text{s.t.}\quad\|\bm{b}-A\bm{x}\|_{2}\leq\epsilon,$

$\min_{\bm{x}\in\mathbb{R}^{n}}\|\bm{x}\|_{1}+\frac{1}{2\lambda}\|\bm{b}-A\bm{x}\|_{2}^{2},$

$\mu=\max_{1\leq i<j\leq n}|\langle\bm{a}_{i},\bm{a}_{j}\rangle|,$

$\bm{x}_{[k]}=\arg\min_{\|\bm{y}\|_{0}\leq k}\|\bm{y}-\bm{x}\|_{2}.$

$f_{a}(x)=ax^{2}+3ax+3,\quad g_{a}(x)=2ax^{2}+4ax+1.$

$(1-(k-1)\mu)\|\bm{y}\|_{2}^{2}\leq\|A\bm{y}\|_{2}^{2}\leq(1+(k-1)\mu)\|\bm{y}\|_{2}^{2}$

$\|A\bm{h}\|_{2}^{2}-2\epsilon\|A\bm{h}\|_{2}\leq$

$\|\bm{h}_{E^{c}}\|_{1}\leq\|\bm{h}_{E}\|_{1}+2\|\bm{x}_{E^{c}}\|_{1}+\frac{\epsilon}{\lambda}\|A\bm{h}\|_{2},$

$\|\bm{x}^{\sharp}\|_{1}+\frac{1}{2\lambda}\|\bm{b}-A\bm{x}^{\sharp}\|_{2}^{2}\leq\|\bm{x}\|_{1}+\frac{1}{2\lambda}\|\bm{b}-A\bm{x}\|_{2}^{2},$

$\|A\bm{h}\|_{2}^{2}-2\langle\bm{z},A\bm{h}\rangle\leq 2\lambda(\|\bm{x}\|_{1}-\|\bm{x}^{\sharp}\|_{1}).$

$\text{LHS}\geq\|A\bm{h}\|_{2}^{2}-2\epsilon\|A\bm{h}\|_{2}.$

$\mu<\frac{1}{k-1}$

$\|\bm{h}_{E}\|_{2}\leq\alpha_{1}\|A\bm{h}\|_{2}+\alpha_{2}\|\bm{h}_{E^{c}}\|_{1},$

$\alpha_{1}\triangleq\frac{\sqrt{1+(k-1)\mu}}{1-(k-1)\mu}\quad\text{and}\quad\alpha_{2}\triangleq\frac{\sqrt{k}\mu}{1-(k-1)\mu}.$

$1<\alpha_{1}<\sqrt{6},\quad\sqrt{k}\alpha_{2}<1\quad\text{and}\quad\frac{1}{1-\sqrt{k}\alpha_{2}}<\frac{1}{1-(2k-1)\mu}.$

$\rho\triangleq|\langle A\bm{h},A\bm{h}_{E}\rangle|.$

$\|A(\bm{x}^{\sharp}-\bm{x})\|_{2}$

$\|\bm{x}^{\sharp}-\bm{x}\|_{2}$

$C_{1}(\alpha_{1})$


Full text


This work was partially supported by the National Science Foundation of China (Grant Nos. 61273020, 61673015) and the China Postdoctoral Science Foundation (Grant No. 2018M643390). (Corresponding author: Jianjun Wang.) W. D. Wang, F. Zhang, and J. J. Wang are with the School of Mathematics and Statistics, Southwest University, Chongqing 400715, China (e-mail: [email protected]; [email protected]; [email protected]). Z. Wang is with the School of Mathematics and Statistics and the College of Computer & Information Science, Southwest University, Chongqing 400715, China (e-mail: [email protected]).

Abstract

In this paper, we consider recovering the signal $\bm{x}\in\mathbb{R}^{n}$ from its few noisy measurements $\bm{b}=A\bm{x}+\bm{z}$, where $A\in\mathbb{R}^{m\times n}$ with $m\ll n$ is the measurement matrix, and $\bm{z}\in\mathbb{R}^{m}$ is the measurement noise/error. We first establish a coherence-based performance guarantee for a regularized $\ell_{1}$-norm minimization model to recover such signals $\bm{x}$ in the presence of $\ell_{2}$-norm bounded noise, i.e., $\|\bm{z}\|_{2}\leq\epsilon$, and then extend these theoretical results to guarantee the robust recovery of signals corrupted with Dantzig Selector (DS) type noise, i.e., $\|A^{T}\bm{z}\|_{\infty}\leq\epsilon$, and the recovery of structured block-sparse signals in the presence of bounded noise. To the best of our knowledge, we are the first to extend, nontrivially, the sharp uniform recovery condition derived by Cai, Wang and Xu (2010) for the constrained $\ell_{1}$-norm minimization model, which takes the form of

$$\mu<\frac{1}{2k-1},$$

where $\mu$ is defined as the (mutual) coherence of $A$, to two unconstrained regularized $\ell_{1}$-norm minimization models to guarantee the robust recovery of any signal (not necessarily $k$-sparse) under the $\ell_{2}$-norm bounded noise and the DS type noise settings, respectively. Besides, a uniform recovery condition and its two resulting error estimates are also established, for the first time to our knowledge, for robust block-sparse signal recovery using a regularized mixed $\ell_{2}/\ell_{1}$-norm minimization model. These results complement the existing theoretical investigation of this model, which focuses on non-uniform recovery conditions and/or robust signal recovery in the presence of random noise.

Index Terms:

Compressed sensing, regularized $\ell_{1}$ -norm minimization, coherence, Dantzig selector, block sparsity

I Introduction

The last decade has seen the burgeoning development of Compressed Sensing (CS) [1, 2] and its widespread applications in many fields. At the core of CS is the problem of efficiently recovering a sparse signal from a relatively small number of linear measurements. Mathematically, for any given signal $\bm{x}\in\mathbb{R}^{n}$, we say that it is sparse if most of its entries are zero. More specifically, if it has at most $k$ non-zero entries, i.e., $\|\bm{x}\|_{0}\triangleq|\text{supp}(\bm{x})|\leq k$, we call it a $k$-sparse signal. In standard CS, one usually observes the linear measurements of the sparse signal $\bm{x}$ via $\bm{b}=A\bm{x}$, where $A\in\mathbb{R}^{m\times n}$ ($m\ll n$) is a given measurement matrix. To recover such a sparse signal, a natural idea is to search for the sparsest solution among all the possible solutions. This directly leads to the following $\ell_{0}$-norm minimization problem

$$\min_{\bm{x}\in\mathbb{R}^{n}}\|\bm{x}\|_{0},\quad\text{s.t.}\quad\bm{b}=A\bm{x}.\tag{1}$$

Unfortunately, this problem is NP-hard in general, and hence it is computationally infeasible. Instead, some algorithms that aim to find suboptimal solutions of (1) were proposed, see, e.g., [3, 4, 5] and their variants [6, 7, 8, 9]. Importantly, many of these algorithms have been proved to perform well under certain conditions.

Besides the above algorithmic strategies, there also exist many other efficient approaches [10, 11, 12, 13, 14, 15] which can circumvent the NP-hardness of (1), and a popular one is the constrained $\ell_{1}$-norm minimization method which solves

$$\min_{\bm{x}\in\mathbb{R}^{n}}\|\bm{x}\|_{1},\quad\text{s.t.}\quad\bm{b}=A\bm{x},\tag{2}$$

where $\|\cdot\|_{1}$ is the $\ell_{1}$-norm of a vector. Problem (2) is convex and can therefore be handled by many convex optimization software packages. To theoretically investigate the equivalence between (1) and (2), one often adopts the Restricted Isometry Constant (RIC) of order $k$ of a matrix, denoted by $\delta_{k}$, which is defined to be the smallest value of $\delta\in(0,1)$ such that

$$(1-\delta)\|\bm{x}\|_{2}^{2}\leq\|A\bm{x}\|_{2}^{2}\leq(1+\delta)\|\bm{x}\|_{2}^{2}$$

for every $k$-sparse vector $\bm{x}$. This notion was first proposed by Candès and Tao in [10], where they showed that (2) is equivalent to (1) in noiselessly recovering any $k$-sparse signal when $\delta_{k}+\delta_{2k}+\delta_{3k}<1$. Subsequently, many researchers worked on improving this condition, see, e.g., [16, 17, 18, 19, 20, 21, 22, 23]. In more general application scenarios of CS, one often wishes to recover the original signal $\bm{x}$ (which may not be exactly sparse) from the noisy observation $\bm{b}$ with

$$\bm{b}=A\bm{x}+\bm{z},\tag{3}$$

where $\bm{z}\in\mathbb{R}^{m}$ is the unknown measurement noise/error, which directly leads to the following optimization problem

$$\min_{\bm{x}\in\mathbb{R}^{n}}\|\bm{x}\|_{1},\quad\text{s.t.}\quad\|\bm{b}-A\bm{x}\|_{2}\leq\epsilon,\tag{4}$$

where $\epsilon\geq 0$. Obviously, (4) reduces to (2) if one takes $\epsilon=0$. It should also be noted that the above-mentioned exact recovery conditions remain valid for guaranteeing the robust recovery of signals from (4) in the presence of noise.

On the other hand, in many practical applications, particularly those with large-scale input data, the constrained problem (4) is not always convenient to solve. Instead, one often solves its unconstrained counterpart, i.e., the ($\ell_{2}$-norm) regularized $\ell_{1}$-norm minimization problem

$$\min_{\bm{x}\in\mathbb{R}^{n}}\|\bm{x}\|_{1}+\frac{1}{2\lambda}\|\bm{b}-A\bm{x}\|_{2}^{2},\tag{5}$$

where $\lambda$ is the positive tradeoff parameter. This problem is also known as the Lasso estimator [24] or Basis Pursuit DeNoising (BPDN) [25], and it can be solved efficiently by many algorithms, see, e.g., [26, 27, 28, 29, 30]. Recently, the relation between (4) and (5) was carefully investigated by Zhang, Yuan and Yin [31] in the context of non-uniform recovery [32], i.e., the recovery of some specific sparse signals, for example, sparse signals restricted to a specific support. However, when it comes to uniform recovery [31, 32], i.e., recovering all (general) sparse signals, for which neither the number nor the support of the non-zero entries is known in advance, it is generally believed that (4) and (5) are not exactly equivalent. As early as 2008, Zhu [33] derived an RIC-based theoretical guarantee for (5), which states that one can robustly recover any $k$-sparse signal $\bm{x}$ observed through (3) with $\|\bm{z}\|_{2}=\epsilon$ by using (5) under a certain $\lambda$, if $A$ obeys $\delta_{4k}+2\delta_{5k}<1$. However, this work has received relatively little attention. In 2009, Bickel, Ritov and Tsybakov [34] established an RIC-like guarantee for (5) in the presence of random noise. Recently, some new RIC conditions were obtained to ensure the robust recovery of some unconstrained $\ell_{1}$-analysis approaches under Dantzig Selector (DS) type noise/error (i.e., $\|A^{T}\bm{z}\|_{\infty}\leq\epsilon$), see [35, 36, 37] for details.
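To make problem (5) concrete, the following is a minimal sketch of one common way to solve it numerically, the iterative soft-thresholding algorithm (ISTA). It is not the method of any of the cited references; the function names `soft_threshold`, `ista_lasso` and `objective`, the fixed iteration count, and the step size $1/\|A\|_{2}^{2}$ are illustrative assumptions. Multiplying the objective of (5) by $\lambda>0$ gives $\lambda\|\bm{x}\|_{1}+\frac{1}{2}\|\bm{b}-A\bm{x}\|_{2}^{2}$, which has the same minimizers and is the form the iteration works with.

```python
import numpy as np

def soft_threshold(v, t):
    """Entrywise soft-thresholding: the proximal map of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista_lasso(A, b, lam, n_iter=500):
    """Approximately minimize ||x||_1 + ||b - A x||_2^2 / (2 * lam) by ISTA.

    Assumption: a fixed number of iterations with step size 1/L, where
    L = ||A||_2^2 is the Lipschitz constant of the gradient of the smooth part.
    """
    L = np.linalg.norm(A, 2) ** 2          # squared spectral norm of A
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        gradient_step = x + A.T @ (b - A @ x) / L
        x = soft_threshold(gradient_step, lam / L)
    return x

def objective(A, b, x, lam):
    """Objective of (5) evaluated at x."""
    return np.sum(np.abs(x)) + np.sum((b - A @ x) ** 2) / (2.0 * lam)

# Usage: with A = I the minimizer is the soft-thresholded data vector.
x_hat = ista_lasso(np.eye(2), np.array([3.0, 0.5]), lam=1.0)
print(x_hat)
```

With the identity as measurement matrix, the iteration reaches its fixed point after one step, which makes the role of $\lambda$ as a threshold easy to see.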

Some random matrices, typified by sub-Gaussian random matrices, have been proved to have a small RIC with overwhelmingly high probability [32]. However, when used in practical scenarios they often suffer from storage and computation limitations. Moreover, it is also NP-hard in general to find the RIC of a given matrix. To overcome these difficulties, some researchers proposed to use the mutual coherence instead, another powerful tool introduced by Mallat and Zhang [38] in their initial research on matching pursuit. In this paper, denoting by $\bm{a}_{i}\in\mathbb{R}^{m}$ the $i$th column of matrix $A$, we define the mutual coherence of the matrix $A\triangleq[\bm{a}_{1},\bm{a}_{2},\cdots,\bm{a}_{n}]$ as

$$\mu=\max_{1\leq i<j\leq n}|\langle\bm{a}_{i},\bm{a}_{j}\rangle|,$$

where we assume that $\bm{a}_{i}$ obeys $\|\bm{a}_{i}\|_{2}=1$ for $i=1,2,\cdots,n$. Many deterministic measurement matrices are in fact designed according to the mutual coherence. There are many coherence-based theoretical results for (4), see, e.g., [39, 40, 41, 42, 43, 44]. In particular, in [44], Cai, Wang and Xu showed that any signal (not necessarily $k$-sparse) can be robustly recovered using (4) if $A$ satisfies $\mu<1/(2k-1)$, and this condition is also sharp for the noiseless recovery of any $k$-sparse signal through (2).
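Unlike the RIC, the mutual coherence can be computed directly from the Gram matrix of the columns. The sketch below illustrates this and checks the condition $\mu<1/(2k-1)$ for a given $k$; the function names `mutual_coherence` and `satisfies_sharp_condition` are hypothetical, and the columns are normalized inside the function so that the unit-norm assumption holds.

```python
import numpy as np

def mutual_coherence(A):
    """Largest absolute inner product between distinct normalized columns of A."""
    A = np.asarray(A, dtype=float)
    norms = np.linalg.norm(A, axis=0)
    if np.any(norms == 0):
        raise ValueError("A has a zero column; the coherence is undefined")
    An = A / norms                      # enforce ||a_i||_2 = 1
    G = An.T @ An                       # Gram matrix of the normalized columns
    np.fill_diagonal(G, 0.0)            # discard the trivial i == j entries
    return float(np.max(np.abs(G)))

def satisfies_sharp_condition(mu, k):
    """Check the coherence condition mu < 1 / (2k - 1)."""
    return mu < 1.0 / (2 * k - 1)

# Usage: two canonical basis vectors plus their normalized sum.
A = np.array([[1.0, 0.0, 2 ** -0.5],
              [0.0, 1.0, 2 ** -0.5]])
mu = mutual_coherence(A)
print(mu, satisfies_sharp_condition(mu, 1), satisfies_sharp_condition(mu, 2))
```

The cost is one matrix product, which is why coherence is attractive for deterministic designs where the RIC cannot be evaluated in practice.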

As far as we know, the first (mutual) coherence-based result on (5) was given by Fuchs [41] in 2004 under the non-uniform recovery setting, which states that any fixed signal $\bm{x}$ with $k$ non-zero entries (i.e., $\|\bm{x}\|_{0}=k$) can be uniquely recovered from $\bm{b}=A\bm{x}$ using (5), if $A$ satisfies $\mu<1/(2k-1)$ and $\lambda$ in (5) is taken small enough. Later, Fuchs [45] further investigated (5) for noisy signal recovery, and showed that if $\bm{b}$ is observed through (3) with $\|\bm{x}\|_{0}=k$ and $\|\bm{z}\|_{2}\leq\epsilon$, $A$ satisfies $\mu\leq c/k$ for a certain $c\leq 1/2$, and $\bm{x}^{\sharp}$ is the optimal solution of (5) under a certain $\lambda$ (related to $\mu$, $c$ and $\epsilon$), then the support of $\bm{x}^{\sharp}$ will be either identical to, or contained in, that of $\bm{x}$. Moreover, Fuchs also showed that if similar constraints are imposed on $\bm{x}$, $\bm{z}$ and $\lambda$, then $\bm{x}^{\sharp}$ and $\bm{x}$ will have their non-zero entries on the same support and with the same signs. Subsequently, Tropp [46] further extended the results in [45] to a more general case. In 2010, Ben-Haim, Eldar and Elad [47] revisited (5) under random noise, and their coherence results have been proved to be better than those in [34]. Note that the above coherence results apply only to signals whose sparsity is known in advance. Recently, using the cumulative coherence [48] tool, Li and Chen [49] established a new uniform recovery condition for (5) to deal with signal recovery in the presence of noise. Their results showed that if $A$ obeys $\mu\leq 1/(\sqrt{3}(5k-2))$, one can robustly recover any signal corrupted with DS type noise. However, the noise they considered is the DS type noise rather than the more commonly used $\ell_{2}$-norm bounded noise, and the recovery condition they obtained still leaves much room for improvement.

In this paper, equipped with the powerful mutual coherence tool, we investigate the performance guarantees of (5) and some of its variants. In summary, the contributions of this paper are as follows:

• We establish a tight uniform recovery condition and two relatively tight error estimates for (5), which are sufficient to guarantee the robust recovery of signals corrupted with $\ell_{2}$-norm bounded noise.

• We extend the obtained theoretical results to guarantee the robust recovery of signals corrupted with DS type noise and also the recovery of structured block-sparse signals in the presence of bounded noise. To the best of our knowledge, these extended results are established for the first time under the uniform recovery setting.

The remainder of this paper is organized as follows. Section II introduces some notations and preliminaries. In Section III, we present the main results. Section IV shows two extensions. Finally, conclusion and future work are given in Section V.

II Notations and Preliminaries

II-A Notations

Throughout this paper, we denote $[r]\triangleq\{1,2,\cdots,r\}$ for any given integer $r$, and $E^{c}=[n]\setminus E$ for any given index set $E\subset[n]$. We also denote by $\bm{h}_{E}$ the vector whose entries are $(\bm{h}_{E})_{i}=\bm{h}_{i}$ for $i\in E$ and 0 otherwise, and write $\|\cdot\|_{a}^{b}=(\|\cdot\|_{a})^{b}$, where $\|\cdot\|_{a}$ represents a certain norm or quasi-norm. For any signal $\bm{x}\in\mathbb{R}^{n}$, we denote its best $k$-term approximation by $\bm{x}_{[k]}$, which is defined as

$$\bm{x}_{[k]}=\arg\min_{\|\bm{y}\|_{0}\leq k}\|\bm{y}-\bm{x}\|_{2}.$$
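The best $k$-term approximation defined above is obtained by keeping the $k$ entries of largest magnitude and zeroing the rest. A minimal sketch follows; the function name `best_k_term` is hypothetical, and ties between entries of equal magnitude are broken arbitrarily by the sort order.

```python
import numpy as np

def best_k_term(x, k):
    """Keep the k largest-magnitude entries of x and set the others to zero."""
    x = np.asarray(x, dtype=float)
    out = np.zeros_like(x)
    if k > 0:
        keep = np.argsort(np.abs(x))[-k:]   # indices of the k largest |x_i|
        out[keep] = x[keep]
    return out

# Usage: the tail x - x_[k] is what remains after the k dominant entries are removed.
x = np.array([3.0, -5.0, 1.0, 0.5])
x_k = best_k_term(x, 2)
print(x_k, np.sum(np.abs(x - x_k)))
```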

Besides, to simplify the notation, we introduce the following two functions

$$f_{a}(x)=ax^{2}+3ax+3,\quad g_{a}(x)=2ax^{2}+4ax+1.$$

II-B Three key lemmas

The proof of our main results relies heavily on the following three lemmas. We start with the first lemma, which provides an RIC-like coherence result for any given matrix.

Lemma 1 ([46, 42]).

Assume that the columns of the matrix $A\in\mathbb{R}^{m\times n}$ are standardized to have unit $\ell_{2}$-norm. Then it holds that

$$(1-(k-1)\mu)\|\bm{y}\|_{2}^{2}\leq\|A\bm{y}\|_{2}^{2}\leq(1+(k-1)\mu)\|\bm{y}\|_{2}^{2}\tag{6}$$

for all $k$ -sparse signals $\bm{y}\in\mathbb{R}^{n}$ .
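As a sanity check on Lemma 1, the inequality can be tested numerically on random $k$-sparse vectors. The sketch below does this for a column-normalized matrix; the function name `check_lemma1`, the number of trials, and the small floating-point tolerance are illustrative assumptions, and a passing check is of course not a proof.

```python
import numpy as np

def check_lemma1(A, k, trials=200, seed=0):
    """Test (1-(k-1)mu)||y||^2 <= ||Ay||^2 <= (1+(k-1)mu)||y||^2 on random k-sparse y."""
    rng = np.random.default_rng(seed)
    A = np.asarray(A, dtype=float)
    A = A / np.linalg.norm(A, axis=0)            # unit l2-norm columns
    G = A.T @ A
    mu = np.max(np.abs(G - np.diag(np.diag(G)))) # mutual coherence
    n = A.shape[1]
    tol = 1e-12                                  # guard against rounding errors
    for _ in range(trials):
        y = np.zeros(n)
        support = rng.choice(n, size=k, replace=False)
        y[support] = rng.standard_normal(k)
        ratio = np.sum((A @ y) ** 2) / np.sum(y ** 2)
        if not (1 - (k - 1) * mu - tol <= ratio <= 1 + (k - 1) * mu + tol):
            return False
    return True

# Usage: a random 20 x 40 Gaussian matrix and 3-sparse test vectors.
A = np.random.default_rng(1).standard_normal((20, 40))
print(check_lemma1(A, k=3))
```

For random matrices of this size $(k-1)\mu$ is typically not small, so the lower bound is often vacuous; the check is mainly useful for deterministic low-coherence designs.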

Lemma 2.

If $\bm{b}$ is observed via (3) with $\|\bm{z}\|_{2}\leq\epsilon$ , then for any subset $E\subset[n]$ with $|E|=k$ , the optimal solution $\bm{x}^{\sharp}$ of (5) satisfies

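$$\|A\bm{h}\|_{2}^{2}-2\epsilon\|A\bm{h}\|_{2}\leq 2\lambda\left(\|\bm{h}_{E}\|_{1}-\|\bm{h}_{E^{c}}\|_{1}+2\|\bm{x}_{E^{c}}\|_{1}\right)\tag{7}$$

and

$$\|\bm{h}_{E^{c}}\|_{1}\leq\|\bm{h}_{E}\|_{1}+2\|\bm{x}_{E^{c}}\|_{1}+\frac{\epsilon}{\lambda}\|A\bm{h}\|_{2},\tag{8}$$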

where $\bm{h}=\bm{x}^{\sharp}-\bm{x}$ .

Proof:

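Since $\bm{x}^{\sharp}$ is the optimal solution of (5), we have

$$\|\bm{x}^{\sharp}\|_{1}+\frac{1}{2\lambda}\|\bm{b}-A\bm{x}^{\sharp}\|_{2}^{2}\leq\|\bm{x}\|_{1}+\frac{1}{2\lambda}\|\bm{b}-A\bm{x}\|_{2}^{2},$$

which is equivalent to

$$\|A\bm{h}\|_{2}^{2}-2\langle\bm{z},A\bm{h}\rangle\leq 2\lambda\left(\|\bm{x}\|_{1}-\|\bm{x}^{\sharp}\|_{1}\right).\tag{9}$$

As to the Left-Hand Side (LHS) of (9), we have

$$\text{LHS}\geq\|A\bm{h}\|_{2}^{2}-2\epsilon\|A\bm{h}\|_{2}.\tag{10}$$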

As to the Right-Hand Side (RHS) of (9), we know

[TABLE]

Therefore, combining (9), (10) and (11) directly leads to (7), and (8) follows trivially from (7). ∎
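The chain of inequalities in this proof can be written out as follows. This is a sketch of the standard argument, under the assumption that the bound on the right-hand side of (9) is the usual triangle-inequality estimate over $E$ and $E^{c}$; it is not a reproduction of the authors' displays.

```latex
% Sketch only: a standard route from (9) and (10) to (7) and (8).
% Write x^\sharp = x + h and split the l1-norm over E and E^c.
\begin{align*}
\|\bm{x}^{\sharp}\|_{1}
  &= \|\bm{x}_{E}+\bm{h}_{E}\|_{1}+\|\bm{x}_{E^{c}}+\bm{h}_{E^{c}}\|_{1} \\
  &\geq \|\bm{x}_{E}\|_{1}-\|\bm{h}_{E}\|_{1}
        +\|\bm{h}_{E^{c}}\|_{1}-\|\bm{x}_{E^{c}}\|_{1}.
\end{align*}
% Since ||x||_1 = ||x_E||_1 + ||x_{E^c}||_1, the right-hand side of (9) obeys
\begin{align*}
2\lambda\left(\|\bm{x}\|_{1}-\|\bm{x}^{\sharp}\|_{1}\right)
  \leq 2\lambda\left(\|\bm{h}_{E}\|_{1}-\|\bm{h}_{E^{c}}\|_{1}
        +2\|\bm{x}_{E^{c}}\|_{1}\right).
\end{align*}
% The left-hand side of (9) is bounded below via Cauchy-Schwarz and ||z||_2 <= eps:
\begin{align*}
\|A\bm{h}\|_{2}^{2}-2\langle\bm{z},A\bm{h}\rangle
  \geq \|A\bm{h}\|_{2}^{2}-2\epsilon\|A\bm{h}\|_{2}.
\end{align*}
% Chaining the two bounds through (9) gives (7). Dropping ||Ah||_2^2 >= 0 in (7)
% and dividing by 2*lambda > 0 gives (8):
\begin{align*}
\|\bm{h}_{E^{c}}\|_{1}
  \leq \|\bm{h}_{E}\|_{1}+2\|\bm{x}_{E^{c}}\|_{1}
       +\frac{\epsilon}{\lambda}\|A\bm{h}\|_{2}.
\end{align*}
```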

Lemma 3.

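If the columns of the matrix $A\in\mathbb{R}^{m\times n}$ are standardized to have unit $\ell_{2}$-norm, and $A$ obeys

$$\mu<\frac{1}{k-1}\tag{12}$$

for a certain integer $k\geq 2$, then for any vector $\bm{h}\in\mathbb{R}^{n}$ and any subset $E\subset[n]$ with $|E|=k$, it holds that

$$\|\bm{h}_{E}\|_{2}\leq\alpha_{1}\|A\bm{h}\|_{2}+\alpha_{2}\|\bm{h}_{E^{c}}\|_{1},\tag{13}$$

where

$$\alpha_{1}\triangleq\frac{\sqrt{1+(k-1)\mu}}{1-(k-1)\mu}\quad\text{and}\quad\alpha_{2}\triangleq\frac{\sqrt{k}\mu}{1-(k-1)\mu}.$$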

Remark 1.

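It is easy to see from Lemma 3 that both $\alpha_{1}$ and $\alpha_{2}$ are monotonically increasing functions of $\mu$. Therefore, if one restricts $\mu<1/(2k-1)$, it follows that

$$1<\alpha_{1}<\sqrt{6},\quad\sqrt{k}\alpha_{2}<1\quad\text{and}\quad\frac{1}{1-\sqrt{k}\alpha_{2}}<\frac{1}{1-(2k-1)\mu}.\tag{14}$$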

Proof:

The proof is simple. We start with estimating the lower and upper bounds of

$$\rho\triangleq|\langle A\bm{h},A\bm{h}_{E}\rangle|.$$

First, using Lemma 1, we know

[TABLE]

Next we estimate the upper bound of $\rho$. It follows from Lemma 1 that

[TABLE]

Now, combining (15), (16) and the condition (12) directly leads to the desired inequality (13). ∎
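For orientation, one way to carry out the two estimates of $\rho$ is sketched below. It uses only Lemma 1, the definition of $\mu$, the Cauchy-Schwarz inequality and $\|\bm{h}_{E}\|_{1}\leq\sqrt{k}\|\bm{h}_{E}\|_{2}$; the intermediate steps are an assumption about the omitted displays rather than the authors' exact derivation, and the constants shown are the ones this particular sketch produces.

```latex
% Sketch only. Lower bound: split Ah = A h_E + A h_{E^c}.
\begin{align*}
\rho
  &\geq \|A\bm{h}_{E}\|_{2}^{2}-|\langle A\bm{h}_{E^{c}},A\bm{h}_{E}\rangle| \\
  &\geq (1-(k-1)\mu)\|\bm{h}_{E}\|_{2}^{2}
        -\mu\|\bm{h}_{E}\|_{1}\|\bm{h}_{E^{c}}\|_{1} \\
  &\geq (1-(k-1)\mu)\|\bm{h}_{E}\|_{2}^{2}
        -\sqrt{k}\mu\|\bm{h}_{E}\|_{2}\|\bm{h}_{E^{c}}\|_{1}.
\end{align*}
% Upper bound: Cauchy-Schwarz followed by Lemma 1 applied to the k-sparse h_E.
\begin{align*}
\rho
  \leq \|A\bm{h}\|_{2}\|A\bm{h}_{E}\|_{2}
  \leq \sqrt{1+(k-1)\mu}\,\|A\bm{h}\|_{2}\|\bm{h}_{E}\|_{2}.
\end{align*}
% If h_E = 0 the claim is trivial. Otherwise combine the two bounds and divide by
% (1-(k-1)mu) ||h_E||_2, which is positive under mu < 1/(k-1):
\begin{align*}
\|\bm{h}_{E}\|_{2}
  \leq \frac{\sqrt{1+(k-1)\mu}}{1-(k-1)\mu}\|A\bm{h}\|_{2}
       +\frac{\sqrt{k}\mu}{1-(k-1)\mu}\|\bm{h}_{E^{c}}\|_{1}.
\end{align*}
```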

III Main Results

With preparations above, we now present our main results.

Theorem 1.

For any $\bm{b}$ observed via (3) with $\|\bm{z}\|_{2}\leq\epsilon$, if the measurement matrix $A$, whose columns are standardized to have unit $\ell_{2}$-norm, satisfies

$$\mu<\frac{1}{2k-1}\tag{17}$$

for certain integer $k\geq 2$ , then we have

[TABLE]

where $\bm{x}^{\sharp}$ is the optimal solution of (5), and

[TABLE]

Remark 2.

Theorem 1 shows that one can robustly recover any signal (not necessarily $k$-sparse) corrupted with $\ell_{2}$-norm bounded noise, if the measurement matrix $A$ satisfies (17). To the best of our knowledge, we are the first to extend this sharp uniform recovery condition, derived by Cai, Wang and Xu in [44] for the constrained problem (2), to its unconstrained counterpart, i.e., the unconstrained problem (5). (The sharp condition/bound here and throughout refers to $\mu<1/(2k-1)$. It has been shown in [44] that any $k$-sparse signal $\bm{x}$, without exception, can be exactly recovered from (2) under this condition, and that there exist a matrix $A$ with $m<n$ obeying $\mu=1/(2k-1)$ and two nonzero $k$-sparse vectors $\bm{\widetilde{x}}$ and $\bm{\widehat{x}}$ with disjoint supports such that $A\bm{\widetilde{x}}=A\bm{\widehat{x}}$.) Similar to [36, 37], if we associate $\epsilon$ with $\lambda$, e.g., by setting $\epsilon=\lambda$, then we get a special case of Theorem 1, which is stated in Corollary 1. In Remark 3, we analyze the tightness of the two error estimates under the setting $\epsilon=\lambda$. Besides, it is also easy to derive other special cases of Theorem 1 to cope with several different sparse recovery tasks. For example, one can consider the robust recovery of exactly sparse signals, i.e., setting the original signal $\bm{x}$ to be exactly $k$-sparse. The detailed analysis of these cases is very similar to that of Corollary 1.

Corollary 1.

Assume that $\bm{b}$ is observed via (3) with $\|\bm{z}\|_{2}\leq\lambda$ . If the measurement matrix $A$ is standardized to have unit $\ell_{2}$ -norm, and also satisfies (17) for certain integer $k\geq 2$ , then we have

[TABLE]

where $\bm{x}^{\sharp}$ is the optimal solution of (5), and

[TABLE]

Remark 3.

Due to the presence of $\alpha_{1}$ and $\alpha_{2}$, the coefficients $\widehat{C}_{1},\cdots,\widehat{C}_{4}$ are inconvenient to analyze. Fortunately, based on the previous estimates for $\alpha_{1}$ and $\alpha_{2}$, i.e., (14), we can give a rough but simple estimate for each $\widehat{C}_{i}$. Specifically,

[TABLE]

These upper bound estimates of coefficients make our recovery error, denoted by RE, have the form of

[TABLE]

where RE stands for $\|A(\bm{x}^{\sharp}-\bm{x})\|_{2}$ or $\|\bm{x}^{\sharp}-\bm{x}\|_{2}$, and $\overline{C}_{1}$ and $\overline{C}_{2}$ depend only on the value of $1-(2k-1)\mu$, which characterizes the gap between the coherence of the selected measurement matrix $A$ and its sharp bound. In form, this result coincides with the ones obtained in [35, 36, 37, 49] for (5). However, one should note that the authors of these works focus on sparse recovery under DS type noise, which is a different setting from ours. Despite this, our upper bound estimates are to some degree still better than theirs, since a much tighter (indeed sharp) recovery condition is used. Moreover, some coefficients in these estimates can be further improved if one optimizes some of the inequalities used to prove Theorem 1.

Proof:

We first assume that $\mu<1/(k-1)$ for certain integer $k\geq 2$ and denote

[TABLE]

then using Lemma 2, Lemma 3, we have

[TABLE]

where we used $\|\bm{h}_{E}\|_{1}\leq\sqrt{k}\|\bm{h}_{E}\|_{2}$. We know from the condition (17) that

[TABLE]

Therefore we can further write (III) as

[TABLE]

which implies that

[TABLE]

This completes (18). Based on (18), (8) and (13), we have

[TABLE]

Besides, using (III) together with (8) and (18) again, we can estimate $\|\bm{h}_{E^{c}}\|_{1}$ as

[TABLE]

On the other hand, let $E_{1}$ be the index set of the $k$ largest entries of $\bm{h}_{E^{c}}$ . Then we know from [36] that

[TABLE]

Similarly, using Lemma 3 again on index $E_{1}$ , we also have

[TABLE]

This, together with (18) and (III), directly leads to

[TABLE]

Now, combining (III), (22) and (III), we can estimate $\|\bm{h}\|_{2}$ as follows:

[TABLE]

which completes the proof. ∎

IV Extensions

In this section, two extensions of Theorem 1 are discussed. They include extending Theorem 1 to guarantee the robust recovery of signals from a DS regularized $\ell_{1}$ -norm minimization model in the presence of the DS type noise, and that of the structured block-sparse signals from two regularized mixed $\ell_{2}/\ell_{1}$ -norm minimization models in the presence of the bounded noise. We start with introducing the DS regularized $\ell_{1}$ -norm minimization model for signal recovery in the presence of the DS type noise.

IV-A Robust recovery via a DS regularized $\ell_{1}$ -norm minimization

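Research on the DS type noise was initiated by Candès and Tao in [50], who aimed at recovering signals corrupted with the DS type noise by solving the following constrained problem

$$\min_{\bm{x}\in\mathbb{R}^{n}}\|\bm{x}\|_{1},\quad\text{s.t.}\quad\|A^{T}(\bm{b}-A\bm{x})\|_{\infty}\leq\epsilon.\tag{25}$$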

Many remarkable results on this problem have been obtained over the past decade, see, e.g., [34, 18, 20, 21, 22, 23, 44]. Similar to the relation of (4) and (5), a closely related problem to (25) is the following DS regularized $\ell_{1}$ -norm minimization problem:

[TABLE]

Inspired by Theorem 1, we also establish a uniform recovery condition and two relatively tight error estimates for (26) to guarantee robust signal recovery in the presence of this kind of noise, see Theorem 2 for details. To the best of our knowledge, this new theorem, together with Theorem 1, is the first to extend the sharp uniform recovery condition obtained in [44] for (2) to its two unconstrained variants, i.e., (5) and (26), to deal with signals corrupted with $\ell_{2}$-norm bounded noise and DS type noise, respectively. In what follows, we present this theorem.

Theorem 2.

For any $\bm{b}$ observed via (3) with $\|A^{T}\bm{z}\|_{\infty}\leq\epsilon$, if the measurement matrix $A$, whose columns are standardized to have unit $\ell_{2}$-norm, satisfies

$$\mu<\frac{1}{2k-1}\tag{27}$$

for certain integer $k\geq 2$ , then we have

[TABLE]

where $\bm{x}^{\sharp}$ here denotes the optimal solution of (26).

Remark 4.

In general, it is usually suggested to recover signals corrupted with the DS type noise using the constrained problem (25), see, e.g., [50, 21, 22, 23]. Recently, some researchers proposed to deal with this kind of noise using the unconstrained problem (5), see, e.g., [35, 36, 37, 49], and they also developed a series of recovery conditions and error estimates for robust recovery from (5). However, these results are far from the best possible. Take, for example, the mutual coherence condition recently obtained in [49] (the original condition is in fact stated in terms of the cumulative coherence, and whether it is sharp was not discussed by the authors), which takes the form of

$$\mu\leq\frac{1}{\sqrt{3}(5k-2)}.\tag{28}$$

Obviously, (28) is strictly stronger than, and hence implied by, our sharp condition (27). As for algorithmic implementation, since (26) is convex, many convex optimization software packages are available to solve it efficiently. Besides, compared to the regularization term (i.e., the second term of the objective function) in (5), the one in (26) is non-smooth and thus non-differentiable. However, if one solves (26) using non-gradient algorithms, such as the alternating direction method of multipliers [29, 51], (26) is still comparable to (5) in terms of algorithmic complexity.
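To give a feel for the gap between the two conditions, the short sketch below tabulates the largest coherence each of them tolerates for a few values of $k$. The threshold attributed to [49] is the one quoted in the Introduction, $\mu\leq 1/(\sqrt{3}(5k-2))$; the function names are hypothetical.

```python
import math

def sharp_threshold(k):
    """Upper limit on mu in the sharp condition mu < 1 / (2k - 1)."""
    return 1.0 / (2 * k - 1)

def li_chen_threshold(k):
    """Upper limit on mu in the condition mu <= 1 / (sqrt(3) * (5k - 2)) quoted from [49]."""
    return 1.0 / (math.sqrt(3) * (5 * k - 2))

# Usage: compare the admissible coherence ranges for several sparsity levels.
for k in (2, 5, 10, 50):
    ratio = sharp_threshold(k) / li_chen_threshold(k)
    print(f"k={k:3d}  sharp={sharp_threshold(k):.4f}  "
          f"[49]={li_chen_threshold(k):.4f}  ratio={ratio:.2f}")
```

As $k$ grows, the ratio of the two thresholds tends to $5\sqrt{3}/2$, so the sharp condition admits matrices with more than four times the coherence.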

Proof:

The proof is very similar to that of Theorem 1, and hence we only present the technical differences here. Our proof also relies on Lemma 1 and on variants of Lemma 2 and Lemma 3. One should keep in mind that the term $\|A\bm{h}\|_{2}$ will be replaced by $\|A^{T}A\bm{h}\|_{\infty}$. Specifically, (7), (8) and (13) are replaced in order by the following inequalities

[TABLE]

These, together with the techniques used in proving Theorem 1, are sufficient to prove Theorem 2. ∎

IV-B Structured block-sparse recovery

Theorem 1 can also be extended to guarantee the robust recovery of structured block-sparse signals. Signals (data) of this kind arise in many applications [52, 53, 54]. We assume w.l.o.g. that there are $l$ blocks with block size $d=n/l$ in the signal $\bm{x}\in\mathbb{R}^{n}$, and then we can write any signal $\bm{x}\in\mathbb{R}^{n}$ as

[TABLE]

where $\bm{x}[i]\in\mathbb{R}^{d}$ denotes the $i$th block sub-vector of $\bm{x}$. If $\bm{x}$ has at most $k$ non-zero blocks, i.e., $\|\bm{x}\|_{2,0}\leq k$, we refer to such a vector $\bm{x}$ as a block $k$-sparse signal. Naturally, a block $k$-sparse signal reduces to the conventional $k$-sparse signal if one takes $d=1$. Accordingly, we can also write any matrix $A\in\mathbb{R}^{m\times n}$ as

[TABLE]

where $A[i]\in\mathbb{R}^{m\times d}$ denotes the $i$th block sub-matrix of $A$. To recover such a structured block-sparse signal, Eldar and Mishali [55] proposed solving the following mixed $\ell_{2}/\ell_{1}$-norm minimization problem:

[TABLE]

where $\|\bm{x}\|_{2,1}\triangleq\sum_{i=1}^{l}\|\bm{x}[i]\|_{2}$, and they also derived a block-RIC recovery condition for (29). Further improved block-RIC conditions can be found in [56, 57, 58, 59]. As early as 2010, Eldar, Kuppinger and Bölcskei [60] generalized the conventional mutual coherence to the block setting, and showed that any block $k$-sparse signal $\bm{x}$ can be exactly recovered via (29) if $A$ obeys

[TABLE]

where $\mu_{B}$ and $\nu$ are called block coherence and sub-coherence, respectively, and they are defined as

[TABLE]
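The block coherence and sub-coherence can be computed in the same spirit as the mutual coherence. The sketch below assumes the definitions commonly attributed to [60], namely $\mu_{B}=\max_{i\neq j}\frac{1}{d}\|A[i]^{T}A[j]\|_{2}$ (spectral norm) and $\nu=\max_{i}\max_{p\neq q}|\langle\bm{a}_{p},\bm{a}_{q}\rangle|$ over columns $\bm{a}_{p},\bm{a}_{q}$ of the same block $A[i]$, with unit-norm columns; these formulas and the function names are assumptions of this illustration and should be checked against the definition above.

```python
import numpy as np

def _blocks(A, d):
    """Split the columns of A into consecutive blocks of width d."""
    A = np.asarray(A, dtype=float)
    n = A.shape[1]
    if n % d != 0:
        raise ValueError("the number of columns must be a multiple of d")
    return [A[:, i:i + d] for i in range(0, n, d)]

def block_coherence(A, d):
    """Assumed definition: max over i != j of ||A[i]^T A[j]||_2 / d."""
    blocks = _blocks(A, d)
    mu_b = 0.0
    for i in range(len(blocks)):
        for j in range(i + 1, len(blocks)):      # the spectral norm is symmetric in (i, j)
            mu_b = max(mu_b, np.linalg.norm(blocks[i].T @ blocks[j], 2) / d)
    return float(mu_b)

def sub_coherence(A, d):
    """Assumed definition: largest |<a_p, a_q>| over distinct columns of one block."""
    nu = 0.0
    for B in _blocks(A, d):
        G = B.T @ B
        off_diagonal = np.abs(G - np.diag(np.diag(G)))
        nu = max(nu, off_diagonal.max())
    return float(nu)

# Usage: with d = 1 the block coherence is the ordinary mutual coherence and nu = 0.
A = np.array([[1.0, 0.0, 2 ** -0.5],
              [0.0, 1.0, 2 ** -0.5]])
print(block_coherence(A, 1), sub_coherence(A, 1))
```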

Obviously, (30) reduces to (17) if one lets the block sub-matrix $A[i]$ be orthonormal, i.e., $(A[i])^{T}A[i]=I_{d}$ with $I_{d}$ the $d\times d$ identity matrix, for all $i\in[l]$, so that $\nu=0$, and also sets the block size $d=1$; see, e.g., [61, 62] for more discussion on block coherence and its related theoretical investigation. Equipped with the block coherence, we here consider extending Theorem 1 to guarantee the robust recovery of such structured block-sparse signals corrupted with bounded noise by solving the following unconstrained problem

[TABLE]

This problem is sometimes called the group Lasso [63], and it can also be viewed as the block (group) extension of (5). Our second extension of Theorem 1 is stated in Theorem 3.
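As an illustration of the block extension of (5), the sketch below applies proximal gradient descent with block soft-thresholding to the objective $\|\bm{x}\|_{2,1}+\frac{1}{2\lambda}\|\bm{b}-A\bm{x}\|_{2}^{2}$. That this is the exact form of (31) is an assumption based on its description as the block analogue of (5); the function names, step size and iteration count are likewise illustrative choices.

```python
import numpy as np

def block_soft_threshold(v, t, d):
    """Shrink each length-d block of v towards zero by t in l2-norm."""
    V = v.reshape(-1, d)
    norms = np.linalg.norm(V, axis=1, keepdims=True)
    # Blocks with norm <= t are set to zero; the tiny floor avoids division by zero.
    scale = np.maximum(1.0 - t / np.maximum(norms, 1e-300), 0.0)
    return (scale * V).reshape(-1)

def group_lasso_prox_grad(A, b, lam, d, n_iter=500):
    """Approximately minimize ||x||_{2,1} + ||b - A x||_2^2 / (2 * lam).

    Assumption: equal-size consecutive blocks of width d, step size 1/||A||_2^2.
    """
    L = np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        gradient_step = x + A.T @ (b - A @ x) / L
        x = block_soft_threshold(gradient_step, lam / L, d)
    return x

# Usage: with A = I, one block survives (shrunk) and the weak block is removed.
b = np.array([3.0, 4.0, 0.3, 0.4])
print(group_lasso_prox_grad(np.eye(4), b, lam=1.0, d=2))
```

Setting `d=1` turns the block shrinkage into ordinary entrywise soft-thresholding, mirroring the way the block model reduces to the conventional sparse model.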

Theorem 3.

For any $\bm{b}$ observed via (3) with $\|\bm{z}\|_{2}\leq\epsilon$ , if the measurement matrix $A$ , whose every block sub-matrix $A[i]$ for $i\in[l]$ is orthonormal, satisfies

[TABLE]

for certain integer $k\geq 2$ , then we have

[TABLE]

where $\bm{x}^{\sharp}$ denotes the optimal solution of (31), $\bm{x}_{\{k\}}$ denotes the best $k$-block approximation of $\bm{x}$, defined as

[TABLE]

and $\beta_{1}$ and $\beta_{2}$ are defined as

[TABLE]

Remark 5.

The idea of using the block coherence and some other tools to deal with structured block-sparse signals has inspired fruitful results, see, e.g., [64, 65, 66, 67, 68, 69, 70]. However, most of these theoretical results focus on constrained optimization problems rather than their unconstrained counterparts. We note that the authors of [64] and [69] have established some block coherence based theoretical results for an adaptive group Lasso model. Although (31) is a special case of this adaptive group Lasso model, the results in Theorem 3 are not covered by, and are in fact quite different from, theirs: Theorem 3 is established in the uniform recovery setting, and it does not require the signals to be block sparse, which makes the stable and/or robust recovery of structured block-sparse signals more flexible. Note that one can also extend (26) to the block setting and develop a theorem similar to Theorem 3 to deal with structured block-sparse signals corrupted with the DS type noise.

Proof:

The proof is very similar to that of Theorem 1, and it relies on variants of Lemma 1, Lemma 2 and Lemma 3. First, (6) is replaced by

[TABLE]

where $\bm{y}$ represents any block $k$-sparse signal. In fact, one can prove it easily using techniques similar to those used to prove Lemma 1. Besides, (7), (8) and (13) are also replaced in order by the following inequalities

[TABLE]

where $E$ denotes the block index set of the $k$ blocks of the original signal $\bm{x}$ with the largest $\ell_{2}$-norm, and $\beta_{1}$ and $\beta_{2}$ are defined in Theorem 3. These, together with the techniques used in proving Theorem 1, are sufficient to prove Theorem 3. ∎

V Conclusion and Future work

In this paper, using mutual coherence as the main tool, we investigated robust signal recovery via several unconstrained models. We first showed that, if the measurement matrix satisfies $\mu<1/(2k-1)$, one can robustly recover any signal (not necessarily $k$-sparse) corrupted with $\ell_{2}$-norm bounded noise using the regularized $\ell_{1}$-norm minimization model (5). We then extended this result to guarantee the robust recovery of signals corrupted with the DS type noise using the DS regularized $\ell_{1}$-norm minimization model (26). To the best of our knowledge, these two results are the first to extend the sharp uniform recovery condition obtained in [44] for (2), which guarantees the exact recovery of any $k$-sparse signal, to its two unconstrained variants, which guarantee the robust recovery of signals corrupted with $\ell_{2}$-norm bounded noise and the DS type noise, respectively. Finally, we extended these results to the robust recovery of structured block-sparse signals corrupted with bounded noise using regularized mixed $\ell_{2}/\ell_{1}$-norm minimization models.
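To make the condition $\mu<1/(2k-1)$ concrete, the following minimal sketch computes the mutual coherence of a small matrix and the largest $k$ for which the condition holds. It is an illustration rather than code from the paper. It uses the standard definition of mutual coherence as the largest absolute normalized inner product between two distinct columns, and the function names `mutual_coherence`, `satisfies_condition` and `largest_admissible_k` are hypothetical.

```python
import math


def mutual_coherence(A):
    """Mutual coherence of a matrix given as a list of rows.

    mu = max over i != j of |<a_i, a_j>| / (||a_i||_2 * ||a_j||_2),
    where a_i is the i-th column. Columns are normalized here, so unit-norm
    columns are not required on input.
    """
    m, n = len(A), len(A[0])
    cols = [[A[r][c] for r in range(m)] for c in range(n)]
    norms = [math.sqrt(sum(v * v for v in col)) for col in cols]
    if any(nrm == 0.0 for nrm in norms):
        raise ValueError("mutual coherence is undefined for a zero column")
    mu = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            inner = sum(a * b for a, b in zip(cols[i], cols[j]))
            mu = max(mu, abs(inner) / (norms[i] * norms[j]))
    return mu


def satisfies_condition(mu, k):
    """Check the coherence condition mu < 1 / (2k - 1) for an integer k >= 1."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    return mu < 1.0 / (2 * k - 1)


def largest_admissible_k(mu):
    """Largest integer k with mu < 1 / (2k - 1); 0 if no k >= 1 qualifies.

    Returns None when mu == 0, since the condition then holds for every k.
    """
    if mu <= 0.0:
        return None
    k = 0
    while satisfies_condition(mu, k + 1):
        k += 1
    return k


# Two unit-norm columns, (1, 0) and (0.6, 0.8), with inner product 0.6.
A = [[1.0, 0.6],
     [0.0, 0.8]]
mu = mutual_coherence(A)
print("mu =", mu)
print("condition holds for k = 1:", satisfies_condition(mu, 1))
print("condition holds for k = 2:", satisfies_condition(mu, 2))
print("largest admissible k for mu = 0.25:", largest_admissible_k(0.25))
```

Because the inequality is strict, rearranging it gives $k<(1+1/\mu)/2$. For example, $\mu=0.25$ admits $k\leq 2$, since $0.25<1/3$ but $0.25>1/5$.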

Much work remains to be done in the future. Potential directions include rebuilding the obtained theoretical results using the mutual coherence tool, extending these recovery conditions to guarantee robust signal recovery in the presence of random noise, and establishing coherence-based performance guarantees for some unconstrained convex/nonconvex models for robust vector/matrix/tensor recovery.

Bibliography

[1] E. J. Candès, J. Romberg, and T. Tao, “Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information,” IEEE Trans. Inf. Theory, vol. 52, no. 2, pp. 489–509, Feb. 2006.
[2] D. L. Donoho, “Compressed sensing,” IEEE Trans. Inf. Theory, vol. 52, no. 4, pp. 1289–1306, Apr. 2006.
[3] J. A. Tropp and A. C. Gilbert, “Signal recovery from random measurements via orthogonal matching pursuit,” IEEE Trans. Inf. Theory, vol. 53, no. 12, pp. 4655–4666, Dec. 2007.
[4] T. Blumensath and M. E. Davies, “Iterative hard thresholding for compressive sensing,” Appl. Comput. Harmon. Anal., vol. 27, no. 3, pp. 265–274, 2009.
[5] H. Mohimani, M. Babaie-Zadeh, and C. Jutten, “A fast approach for overcomplete sparse decomposition based on smoothed $\ell_{0}$ norm,” IEEE Trans. Signal Process., vol. 57, no. 1, pp. 289–301, Jan. 2009.
[6] D. Needell and R. Vershynin, “Signal recovery from incomplete and inaccurate measurements via regularized orthogonal matching pursuit,” IEEE J. Sel. Topics Signal Process., vol. 4, no. 2, pp. 310–316, Apr. 2010.
[7] D. L. Donoho, Y. Tsaig, I. Drori, and J.-L. Starck, “Sparse solution of underdetermined systems of linear equations by stagewise orthogonal matching pursuit,” IEEE Trans. Inf. Theory, vol. 58, no. 2, pp. 1094–1121, Feb. 2012.
[8] N. B. Karahanoglu and H. Erdogan, “A* orthogonal matching pursuit: Best-first search for compressed sensing signal recovery,” Digit. Signal Process., vol. 22, no. 4, pp. 555–568, Jul. 2012.
