Recovery of Structured Signals From Corrupted Non-Linear Measurements
Zhongxing Sun, Wei Cui, and Yulong Liu

TL;DR
This paper proposes an extended Lasso method to recover structured signals from a limited number of corrupted non-linear measurements, providing theoretical conditions for successful reconstruction of both signal and corruption.
Contribution
Introduction of an extended Lasso approach for disentangling signals and corruption in non-linear measurement models with theoretical recovery guarantees.
Findings
Successful recovery conditions established
Extended Lasso effectively separates signal and corruption
Applicable to various structured signal models
Zhongxing Sun and Wei Cui
School of Information and Electronics
Beijing Institute of Technology
Beijing 100081, China
Email: {zhongxingsun, cuiwei}@bit.edu.cn
Yulong Liu
School of Physics
Beijing Institute of Technology
Beijing 100081, China
Email: [email protected]
Abstract
This paper studies the problem of recovering a structured signal from a relatively small number of corrupted non-linear measurements. Assuming that signal and corruption are contained in some structure-promoted set, we suggest an extended Lasso to disentangle signal and corruption. We also provide conditions under which this recovery procedure can successfully reconstruct both signal and corruption.
I Introduction
Throughout science and engineering, one is often faced with the challenge of recovering a structured signal from a relatively small number of linear observations
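$$\bm{y} = \bm{\Phi}\bm{x}^\star + \bm{n},$$
where $\bm{\Phi}\in\mathbb{R}^{m\times n}$ is the sensing matrix, $\bm{x}^\star\in\mathbb{R}^{n}$ is the desired structured signal, and $\bm{n}\in\mathbb{R}^{m}$ is the random noise. The objective is to estimate $\bm{x}^\star$ from given knowledge of $\bm{y}$ and $\bm{\Phi}$. Since this problem is generally ill-posed, tractable recovery is possible only when the signal is suitably structured. A general model to encode signal structure is to assume that $\bm{x}^\star$ belongs to some set $\mathcal{K}\subset\mathbb{R}^{n}$. For example, to promote sparsity (or low-rankness) of the solution, one can choose $\mathcal{K}$ to be a scaled $\ell_1$ (or nuclear norm) ball. Then the signal can be recovered by solving the following $\mathcal{K}$-Lasso problem:
$$\min_{\bm{x}}\ \|\bm{y}-\bm{\Phi}\bm{x}\|_2 \quad \text{s.t.} \quad \bm{x}\in\mathcal{K}. \tag{1}$$
The performance of $\mathcal{K}$-Lasso (and its variants) under linear measurements has been extensively studied in the literature, see e.g., [1, 2, 3, 4] and references therein.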
However, in many applications of interest the linear model may not be plausible. Important examples include $1$-bit compressed sensing [5] and generalized linear models [6]. In these scenarios, measurements can be modeled by the semiparametric single index model [7, 8]
$$y_i = f_i(\langle\bm{\Phi}_i,\bm{x}^\star\rangle), \quad i=1,\ldots,m, \tag{2}$$
where $f_i$ are independent copies of an unknown non-linear map $f$ (which may also be deterministic) and $\bm{\Phi}_i$ denote the rows of $\bm{\Phi}$. In a seminal paper [9], Plan and Vershynin present a theoretical analysis of $\mathcal{K}$-Lasso under the non-linear observation model (2). Their results show that non-linear observations behave as scaled and noisy linear observations, and under suitable conditions, a scaled version of the original signal can be recovered by $\mathcal{K}$-Lasso.
This work extends that of [9] to a more challenging setting, in which the non-linear measurements are corrupted by an unknown but structured vector $\bm{v}^\star\in\mathbb{R}^{m}$, i.e.,
$$y_i = f_i(\langle\bm{\Phi}_i,\bm{x}^\star\rangle) + \sqrt{m}\,v_i^\star, \quad i=1,\ldots,m. \tag{3}$$
This model is motivated by some practical applications:
- Clipping or saturation noise: signal clipping or saturation frequently appears in power amplifiers and analog-to-digital converters (ADCs) because of the limited range of the devices [10, 11]. In those cases, one measures $f(\bm{\Phi}\bm{x}^\star)$ rather than $\bm{\Phi}\bm{x}^\star$, where $f$ is typically a non-linear map, and saturation occurs when the input exceeds the maximum or minimum device output. Unlike white noise or quantization error, the saturation error can be unbounded. However, it will be sparse provided the clipping level is high enough, which means the model (3) is appropriate. Eliminating the saturation effect may be difficult in a broad class of radar and sonar systems [12].
- State estimation for electrical power networks: non-linear measurements caused by device constraints are sent to the central control center in power networks. These measurements may contain gross errors or outliers, modeled by structured corruptions of arbitrary amplitude, due to system malfunctions. State estimation in power networks therefore needs to detect and eliminate these large measurement errors [13, 14, 15, 16].
In particular, if $f$ is the identity function, the model (3) reduces to the standard corrupted sensing problem [17, 18, 19, 20, 21, 22].
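To make the sparsity claim for saturation concrete, the following sketch (not taken from the paper) clips simulated Gaussian measurements at two hypothetical levels and reports the fraction of entries whose saturation error is nonzero. The problem sizes, the clipping levels and the helper name `saturation_error` are illustrative assumptions.

```python
import numpy as np

def saturation_error(u, level):
    """Additive error clip(u) - u introduced by hard clipping at +/- level."""
    return np.clip(u, -level, level) - u

rng = np.random.default_rng(0)
m, n = 20_000, 50                       # illustrative sizes (assumed)
x = rng.standard_normal(n)
x /= np.linalg.norm(x)                  # unit-norm signal
u = rng.standard_normal((m, n)) @ x     # linear measurements, each ~ N(0, 1)

frac_low = np.mean(saturation_error(u, 0.5) != 0)    # aggressive clipping
frac_high = np.mean(saturation_error(u, 2.0) != 0)   # mild clipping
print(f"fraction saturated: level 0.5 -> {frac_low:.3f}, level 2.0 -> {frac_high:.3f}")
```

With unit-variance inputs, a clipping level of 2 leaves only a few percent of the entries saturated, which is the regime in which a sparse corruption model is plausible.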
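Assume that $(\bm{x}^\star,\bm{v}^\star)$ belongs to some set $\mathcal{K}\subset\mathbb{R}^{n}\times\mathbb{R}^{m}$ which is meant to capture the structures of signal and corruption. A natural method to disentangle signal and corruption is to minimize the $\ell_2$ loss subject to a geometric constraint:
$$\min_{\bm{x},\bm{v}}\ \|\bm{y}-\bm{\Phi}\bm{x}-\sqrt{m}\bm{v}\|_2 \quad \text{s.t.} \quad (\bm{x},\bm{v})\in\mathcal{K}. \tag{4}$$
This procedure might be regarded as an extension of $\mathcal{K}$-Lasso [9].
The goal of this paper is to investigate the performance of $\mathcal{K}$-Lasso (4) under the model (3). To this end, we require some model assumptions: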
- Gaussian measurements: we assume that the rows of $\bm{\Phi}$ are i.i.d. Gaussian vectors, i.e., $\bm{\Phi}_i\sim\mathcal{N}(\bm{0},\bm{I}_n)$. Note that the factor $\sqrt{m}$ in the model (3) makes the columns of both $\bm{\Phi}$ and $\sqrt{m}\bm{I}_m$ have the same scale, which helps our theoretical results to be more interpretable.
- Unit norm of the signal: without loss of generality, we assume that $\|\bm{x}^\star\|_2=1$ because the norm of $\bm{x}^\star$ may be absorbed into the non-linear function $f$.
- Sub-Gaussian distribution of $f(\langle\bm{\Phi}_i,\bm{x}^\star\rangle)$: we assume that $f(\langle\bm{\Phi}_i,\bm{x}^\star\rangle)$ are sub-Gaussian variables as in [23]. To understand this assumption, note that since $\langle\bm{\Phi}_i,\bm{x}^\star\rangle$ is Gaussian, $f(\langle\bm{\Phi}_i,\bm{x}^\star\rangle)$ will be sub-Gaussian provided that $f$ does not grow faster than linearly, namely, $|f(x)|\le a+b|x|$ for some scalars $a$ and $b$.
Under the above assumptions, we establish theoretical guarantees for $\mathcal{K}$-Lasso (4) under corrupted non-linear measurements (3). Our results demonstrate that under proper conditions, it is possible to disentangle signal and corruption in this quite challenging scenario.
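The extended Lasso can be tried numerically. The sketch below is one possible instance, not the paper's implementation: it assumes the constraint set is a product of two $\ell_1$ balls with oracle radii, takes $f=\mathrm{sign}$ (1-bit measurements), and solves the squared version of the constrained least-squares problem by projected gradient descent. The function names, problem sizes and iteration count are hypothetical choices.

```python
import numpy as np

def project_l1_ball(w, radius):
    """Euclidean projection of w onto {u : ||u||_1 <= radius} (sort-based method)."""
    if np.abs(w).sum() <= radius:
        return w.copy()
    u = np.sort(np.abs(w))[::-1]
    css = np.cumsum(u)
    k = np.arange(1, w.size + 1)
    rho = np.nonzero(u * k > css - radius)[0][-1]
    theta = (css[rho] - radius) / (rho + 1.0)
    return np.sign(w) * np.maximum(np.abs(w) - theta, 0.0)

def extended_lasso(Phi, y, radius_x, radius_v, n_iter=300):
    """Projected gradient descent on 0.5 * ||y - Phi x - sqrt(m) v||_2^2 over two l1 balls."""
    m, n = Phi.shape
    sqrt_m = np.sqrt(m)
    # Step size 1 / ||[Phi, sqrt(m) I]||^2, using ||[Phi, sqrt(m) I]||^2 = ||Phi||^2 + m.
    step = 1.0 / (np.linalg.norm(Phi, 2) ** 2 + m)
    x, v = np.zeros(n), np.zeros(m)
    for _ in range(n_iter):
        r = Phi @ x + sqrt_m * v - y                      # current residual
        x = project_l1_ball(x - step * (Phi.T @ r), radius_x)
        v = project_l1_ball(v - step * sqrt_m * r, radius_v)
    return x, v

rng = np.random.default_rng(0)
m, n, k_x, k_v = 1000, 100, 5, 5                          # illustrative sizes (assumed)
x_star = np.zeros(n)
x_star[rng.choice(n, k_x, replace=False)] = rng.standard_normal(k_x)
x_star /= np.linalg.norm(x_star)                          # unit-norm sparse signal
v_star = np.zeros(m)
v_star[rng.choice(m, k_v, replace=False)] = rng.choice([-1.0, 1.0], k_v)
Phi = rng.standard_normal((m, n))
y = np.sign(Phi @ x_star) + np.sqrt(m) * v_star           # corrupted 1-bit measurements

mu = np.sqrt(2.0 / np.pi)                                 # E[sign(g) g] for g ~ N(0, 1)
x_hat, v_hat = extended_lasso(Phi, y, mu * np.abs(x_star).sum(), np.abs(v_star).sum())
cosine = x_hat @ x_star / np.linalg.norm(x_hat)
rel_err_v = np.linalg.norm(v_hat - v_star) / np.linalg.norm(v_star)
print(f"cosine(x_hat, x_star) = {cosine:.3f}, relative error in v = {rel_err_v:.3f}")
```

The oracle radii are used only to keep the illustration short; in practice they would have to be estimated or replaced by a penalized formulation. Since only a scaled signal can be recovered from non-linear measurements, the direction of `x_hat` is compared with `x_star` rather than the vector itself.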
II Preliminaries
In this section, we review some preliminaries which underlie our analysis. Hereafter, $\mathbb{S}^{n-1}$ and $\mathbb{B}_2^{n}$ denote the unit sphere and ball in $\mathbb{R}^{n}$ under the $\ell_2$ norm respectively. We use notation such as $C$ and $c$ to refer to absolute constants whose value may change from line to line.
II-A Convex Geometry
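The tangent cone of a set $\mathcal{K}\subset\mathbb{R}^{n}$ at $\bm{x}\in\mathcal{K}$ is defined as
$$\mathcal{D}(\mathcal{K},\bm{x}) := \{t\bm{u} : t\ge 0,\ \bm{u}\in\mathcal{K}-\bm{x}\}.$$
The tangent cone may also be called the descent cone.
The Gaussian width and the Gaussian complexity of a set $\mathcal{T}\subset\mathbb{R}^{n}$ are, respectively, defined as
$$w(\mathcal{T}) := \mathbb{E}\sup_{\bm{x}\in\mathcal{T}}\langle\bm{g},\bm{x}\rangle, \quad \bm{g}\sim\mathcal{N}(\bm{0},\bm{I}_n),$$
and
$$\gamma(\mathcal{T}) := \mathbb{E}\sup_{\bm{x}\in\mathcal{T}}|\langle\bm{g},\bm{x}\rangle|, \quad \bm{g}\sim\mathcal{N}(\bm{0},\bm{I}_n).$$
These two geometric quantities are closely related to each other [24]:
$$\frac{1}{3}\big[w(\mathcal{T})+\|\bm{y}\|_2\big] \le \gamma(\mathcal{T}) \le 2\big[w(\mathcal{T})+\|\bm{y}\|_2\big] \quad \text{for any } \bm{y}\in\mathcal{T}. \tag{5}$$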
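As a numerical illustration (not part of the paper), the sketch below estimates the Gaussian width $w(\mathcal{T})=\mathbb{E}\sup_{\bm{x}\in\mathcal{T}}\langle\bm{g},\bm{x}\rangle$ and the Gaussian complexity $\gamma(\mathcal{T})=\mathbb{E}\sup_{\bm{x}\in\mathcal{T}}|\langle\bm{g},\bm{x}\rangle|$ of a small finite set by Monte Carlo, and compares them with the two-sided bound $\frac{1}{3}[w(\mathcal{T})+\|\bm{y}\|_2]\le\gamma(\mathcal{T})\le 2[w(\mathcal{T})+\|\bm{y}\|_2]$ for a point $\bm{y}\in\mathcal{T}$. The set, its size and the number of trials are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n, n_points, n_trials = 20, 50, 20_000
# A finite set T, deliberately shifted away from the origin so that w and gamma differ.
T = rng.standard_normal((n_points, n)) + 3.0
G = rng.standard_normal((n_trials, n))          # independent draws of g ~ N(0, I_n)
inner = G @ T.T                                 # inner[j, i] = <g_j, x_i>

width = inner.max(axis=1).mean()                # Monte Carlo estimate of w(T)
complexity = np.abs(inner).max(axis=1).mean()   # Monte Carlo estimate of gamma(T)
norm_y = np.linalg.norm(T[0])                   # any point y in T may be used
lower = (width + norm_y) / 3.0
upper = 2.0 * (width + norm_y)
print(f"w(T) ~ {width:.2f}, gamma(T) ~ {complexity:.2f}, bounds [{lower:.2f}, {upper:.2f}]")
```

For sets that are symmetric about the origin the two quantities coincide, which is why the example uses a shifted set.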
The local Gaussian width of a set $\mathcal{K}$ is a function of a parameter $t\ge 0$ defined as
[TABLE]
II-B High-Dimensional Probability
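A random variable $X$ is called a sub-Gaussian random variable if the sub-Gaussian norm
$$\|X\|_{\psi_2} := \inf\{t>0 : \mathbb{E}\exp(X^2/t^2)\le 2\}$$
is finite. A random vector $\bm{x}$ in $\mathbb{R}^{n}$ is a sub-Gaussian random vector if all of its one-dimensional marginals are sub-Gaussian random variables. The sub-Gaussian norm of $\bm{x}$ is defined as
$$\|\bm{x}\|_{\psi_2} := \sup_{\bm{y}\in\mathbb{S}^{n-1}}\|\langle\bm{x},\bm{y}\rangle\|_{\psi_2}.$$
A random vector $\bm{x}$ in $\mathbb{R}^{n}$ is isotropic if $\mathbb{E}\,\bm{x}\bm{x}^{T}=\bm{I}_n$.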
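As a worked example, consider the standard Gaussian case that underlies the measurement model. The computation below assumes the Orlicz-type definition $\|X\|_{\psi_2}=\inf\{t>0:\mathbb{E}\exp(X^2/t^2)\le 2\}$ used in [25]; with a different but equivalent definition the constant would change.

```latex
% For g ~ N(0,1) and t^2 > 2, the Gaussian moment generating function of g^2 gives
\mathbb{E}\exp\!\big(g^2/t^2\big) = \big(1 - 2/t^2\big)^{-1/2}.
% Requiring this to be at most 2:
\big(1 - 2/t^2\big)^{-1/2} \le 2
\iff 1 - \frac{2}{t^2} \ge \frac{1}{4}
\iff t^2 \ge \frac{8}{3},
\qquad\text{hence}\qquad
\|g\|_{\psi_2} = \sqrt{8/3} \approx 1.63.
% For \bm{g} ~ N(0, I_n), every marginal <\bm{g}, \bm{y}> with \|\bm{y}\|_2 = 1 is N(0,1), so
\|\bm{g}\|_{\psi_2} = \sqrt{8/3},
\qquad
\mathbb{E}\,\bm{g}\bm{g}^{T} = \bm{I}_n .
```

In particular, a standard Gaussian vector is both sub-Gaussian and isotropic, with a sub-Gaussian norm that does not depend on the dimension.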
II-C A Useful Tool
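In the proofs of our main results, we make heavy use of the following matrix deviation inequality, which implies a tight lower bound for the restricted singular value of the extended sensing matrix $[\bm{\Phi},\sqrt{m}\bm{I}_m]$.
Fact 1 (Extended Matrix Deviation Inequality, [22]).
Let $\bm{\Phi}$ be an $m\times n$ matrix whose rows $\bm{\Phi}_i$ are independent centered isotropic sub-Gaussian vectors with $K=\max_i\|\bm{\Phi}_i\|_{\psi_2}$, and $\mathcal{T}$ be a bounded subset of $\mathbb{R}^{n}\times\mathbb{R}^{m}$. Then for any $s\ge 0$, the event
$$\sup_{(\bm{a},\bm{b})\in\mathcal{T}}\Big|\|\bm{\Phi}\bm{a}+\sqrt{m}\bm{b}\|_2-\sqrt{m}\sqrt{\|\bm{a}\|_2^2+\|\bm{b}\|_2^2}\Big| \le CK^2\big[\gamma(\mathcal{T})+s\cdot\mathrm{rad}(\mathcal{T})\big]$$
holds with probability at least $1-\exp(-s^2)$, where $\mathrm{rad}(\mathcal{T}):=\sup_{\bm{x}\in\mathcal{T}}\|\bm{x}\|_2$ denotes the radius of $\mathcal{T}$.
In particular, when $\mathcal{T}$ is a subset of $\mathbb{S}^{n+m-1}$ or $t\mathbb{S}^{n+m-1}$, Fact 1 implies that the event
$$\inf_{(\bm{a},\bm{b})\in\mathcal{T}}\|\bm{\Phi}\bm{a}+\sqrt{m}\bm{b}\|_2 \ge \sqrt{m}-CK^2\gamma(\mathcal{T}) \tag{6}$$
holds with probability at least $1-\exp(-\gamma(\mathcal{T})^2)$, or the event
$$\inf_{(\bm{a},\bm{b})\in\mathcal{T}}\|\bm{\Phi}\bm{a}+\sqrt{m}\bm{b}\|_2 \ge t\sqrt{m}-CK^2\gamma(\mathcal{T}) \tag{7}$$
holds with probability at least $1-\exp(-\gamma(\mathcal{T})^2/t^2)$.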
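The restricted singular value of the extended matrix $[\bm{\Phi},\sqrt{m}\bm{I}_m]$ can be checked numerically on a simple set. The sketch below (an illustration under assumed sizes, not the paper's experiment) takes $\mathcal{T}$ to be the unit vectors $(\bm{a},\bm{b})$ supported on a few fixed coordinates, for which the infimum of $\|\bm{\Phi}\bm{a}+\sqrt{m}\bm{b}\|_2$ over $\mathcal{T}$ is the smallest singular value of the corresponding column submatrix.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, k_a, k_b = 400, 200, 5, 5                  # illustrative sizes (assumed)
Phi = rng.standard_normal((m, n))
S_a = rng.choice(n, k_a, replace=False)          # coordinates allowed for a
S_b = rng.choice(m, k_b, replace=False)          # coordinates allowed for b

# Columns of [Phi, sqrt(m) I_m] restricted to the chosen coordinates.
A_T = np.hstack([Phi[:, S_a], np.sqrt(m) * np.eye(m)[:, S_b]])
s_min = np.linalg.svd(A_T, compute_uv=False).min()
ratio = s_min / np.sqrt(m)
print(f"restricted singular value / sqrt(m) = {ratio:.3f}")
```

Because this set has a Gaussian complexity of order $\sqrt{k_a+k_b}$, far below $\sqrt{m}$, the ratio stays bounded away from zero, which is the behavior a lower bound of the form $\sqrt{m}-CK^2\gamma(\mathcal{T})$ predicts.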
III Main Results
Before stating our results, we need to introduce two nonlinearity parameters, which are essentially the intrinsic mean and variance associated with the non-linear map $f$. Let $g$ be a standard normal random variable. The two parameters are defined as in [9]:
$$\mu := \mathbb{E}[f(g)\,g], \qquad \sigma^2 := \mathbb{E}\big[(f(g)-\mu g)^2\big].$$
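The two parameters are easy to estimate by simulation. The sketch below assumes the definitions of [9], namely $\mu=\mathbb{E}[f(g)g]$ and $\sigma^2=\mathbb{E}[(f(g)-\mu g)^2]$ for $g\sim\mathcal{N}(0,1)$, and evaluates them by Monte Carlo for the identity map and for the sign function; the helper name `nonlinearity_parameters` and the sample size are hypothetical.

```python
import numpy as np

def nonlinearity_parameters(f, n_samples=1_000_000, seed=0):
    """Monte Carlo estimates of mu = E[f(g) g] and sigma = sqrt(E[(f(g) - mu g)^2])."""
    rng = np.random.default_rng(seed)
    g = rng.standard_normal(n_samples)
    fg = f(g)
    mu = np.mean(fg * g)
    sigma = np.sqrt(np.mean((fg - mu * g) ** 2))
    return mu, sigma

mu_id, sigma_id = nonlinearity_parameters(lambda t: t)   # linear measurements
mu_sign, sigma_sign = nonlinearity_parameters(np.sign)   # 1-bit measurements
print(f"identity: mu = {mu_id:.3f}, sigma = {sigma_id:.3f}")
print(f"sign:     mu = {mu_sign:.3f}, sigma = {sigma_sign:.3f}")
```

For the identity map the estimates approach $\mu=1$ and $\sigma=0$, and for the sign function they approach $\mu=\sqrt{2/\pi}$ and $\sigma=\sqrt{1-2/\pi}$, so a non-linearity acts like a scaling by $\mu$ plus a noise of level $\sigma$.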
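We then present two main results: the first considers the case where the signal lies at an extreme point of $\mathcal{K}$, and the second assumes that it lies in the interior of $\mathcal{K}$.
Theorem 1.
Let $(\hat{\bm{x}},\hat{\bm{v}})$ be the solution to the $\mathcal{K}$-Lasso (4). Suppose that $\bm{\Phi}_i\sim\mathcal{N}(\bm{0},\bm{I}_n)$, $\|\bm{x}^\star\|_2=1$, and that $f(\langle\bm{\Phi}_i,\bm{x}^\star\rangle)$ are centered sub-Gaussian random variables with sub-Gaussian norm $\psi$. Assume that $(\mu\bm{x}^\star,\bm{v}^\star)\in\mathcal{K}$, and let $\mathcal{D}:=\mathcal{D}(\mathcal{K},(\mu\bm{x}^\star,\bm{v}^\star))$. If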
[TABLE]
then, for any , the event
[TABLE]
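holds with probability at least $1-2\exp(-cs^{2}\sigma^{4}/(\psi+\mu)^{4})-\exp\big(-\gamma(\mathcal{D}\cap\mathbb{S}^{n+m-1})^{2}\big)$.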
Remark 1 (Relation to corrupted sensing).
If is the identity function, then we have , and . Thus Theorem 1 implies that if , -Lasso (4) succeeds with high probability, which is consistent with the constrained recovery results in [17, Theorem 1] and [22, Theorem 2].
Note that the squared Gaussian width of $\mathcal{D}\cap\mathbb{S}^{n+m-1}$ plays the role of the effective dimension of the descent cone $\mathcal{D}$. When $(\mu\bm{x}^\star,\bm{v}^\star)$ lies on the boundary of $\mathcal{K}$, the descent cone may be narrow and hence have a small effective dimension. In that case Theorem 1 is quite reasonable: a good estimate is guaranteed if the number of observations exceeds the effective dimension of $\mathcal{D}$, which may be much smaller than the ambient dimension $n+m$. However, when $(\mu\bm{x}^\star,\bm{v}^\star)$ is an interior point of $\mathcal{K}$, the descent cone is the entire space and the effective dimension is of the order of the ambient dimension $n+m$. In this case, the results in Theorem 1 become meaningless. The following theorem deals with this situation. As it turns out, the local Gaussian width serves as a new measure of the low-dimensional structure of a set that need not be a cone.
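Theorem 2.
Let $(\hat{\bm{x}},\hat{\bm{v}})$ be the solution to the $\mathcal{K}$-Lasso (4). Suppose that $\bm{\Phi}_i\sim\mathcal{N}(\bm{0},\bm{I}_n)$, $\|\bm{x}^\star\|_2=1$, and that $f(\langle\bm{\Phi}_i,\bm{x}^\star\rangle)$ are centered sub-Gaussian random variables with sub-Gaussian norm $\psi$. Assume that $(\mu\bm{x}^\star,\bm{v}^\star)\in\mathcal{K}$ and that $\mathcal{K}$ is a star shaped set (a set is star shaped if it contains $\lambda\bm{u}$ for each of its points $\bm{u}$ and every $\lambda\in[0,1]$; in particular, any convex set containing the origin is star shaped). If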
[TABLE]
then, for any , the event
[TABLE]
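holds with probability at least $1-2\exp(-cs^{2}\sigma^{4}/(\psi+\mu)^{4})-\exp\big(-\gamma(\mathcal{K}\cap t\mathbb{S}^{n+m-1})^{2}/t^{2}\big)$.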
Remark 2 (Local Gaussian width).
Note that if we let , then goes to , which is of the order of . Then the results in Theorem 2 coincide with those of Theorem 1 when $(\mu\bm{x}^\star,\bm{v}^\star)$ is an interior point of $\mathcal{K}$. This suggests that Theorem 1 can be regarded as an extreme case of Theorem 2, and that the local Gaussian width can better characterize the low-dimensional structure of sets than the Gaussian width.
Remark 3 (Relation to results in [9]).
Theorems 1 and 2 show that the recovery error can be made arbitrarily small provided that the number of measurements is large enough. In particular, in the corruption-free case (i.e., without the term in the high-probability bounds), our results also agree with the corresponding theorems in [9].
IV Proofs of Main Results
Before proving Theorems 1 and 2, we require two useful lemmas.
Lemma 1.
Suppose that and are centered sub-Gaussian random variables with sub-Gaussian norm . Assume is a star shaped set and let . Then, for any , the event
[TABLE]
holds with probability at least .
Proof.
See Appendix A. ∎
Lemma 2.
Let be a star shaped set and . Suppose that . Then, the following lower bound
[TABLE]
holds for all satisfying with probability at least $1-\exp\big(-\gamma(\mathcal{K}\cap t\mathbb{S}^{n+m-1})^{2}/t^{2}\big)$.
Proof.
Let and . Then . Thus we have
[TABLE]
holds with probability at least $1-\exp\big(-\gamma(\mathcal{K}\cap t\mathbb{S}^{n+m-1})^{2}/t^{2}\big)$. The first inequality holds because is star shaped, then . The second inequality follows from (7). The third inequality holds because of (5) and , i.e.,
[TABLE]
The last inequality follows from the assumption on the number of measurements . ∎
IV-A Proof of Theorem 1
Proof.
For clarity, the proof is divided into three steps.
Step 1: Problem reduction. Since is the solution to the -Lasso problem (4) and , then we have
[TABLE]
Recall that , then . Let and . Then (12) can be reformulated as
[TABLE]
Squaring both sides of (13) yields
[TABLE]
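For orientation, the reduction in Step 1 follows the standard optimality argument. The sketch below is a reconstruction under assumed notation, not a quotation of the paper's displays: $\bm{z}:=f(\bm{\Phi}\bm{x}^\star)-\mu\bm{\Phi}\bm{x}^\star$ (with entries $\bm{z}_i$ as in Appendix B), $\bm{h}:=\hat{\bm{x}}-\mu\bm{x}^\star$ and $\bm{e}:=\hat{\bm{v}}-\bm{v}^\star$.

```latex
\begin{align*}
% optimality of (\hat{x}, \hat{v}) and feasibility of (\mu x^\star, v^\star)
\|\bm{y}-\bm{\Phi}\hat{\bm{x}}-\sqrt{m}\hat{\bm{v}}\|_2
  &\le \|\bm{y}-\mu\bm{\Phi}\bm{x}^\star-\sqrt{m}\bm{v}^\star\|_2 = \|\bm{z}\|_2, \\
% substitute y = z + \mu\Phi x^\star + \sqrt{m} v^\star on the left-hand side
\|\bm{z}-(\bm{\Phi}\bm{h}+\sqrt{m}\bm{e})\|_2 &\le \|\bm{z}\|_2, \\
% square both sides and cancel \|z\|_2^2
\|\bm{\Phi}\bm{h}+\sqrt{m}\bm{e}\|_2^2 &\le 2\,\langle\bm{z},\,\bm{\Phi}\bm{h}+\sqrt{m}\bm{e}\rangle .
\end{align*}
```

The remaining steps then bound the left-hand side from below with the extended matrix deviation inequality and the right-hand side from above with Lemma 1.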
Step 2: Lower Bound on . Define the error set
[TABLE]
in which the error vector lives. Clearly, belongs to the tangent cone . It then follows from (6) that the event
[TABLE]
holds with probability at least $1-\exp\big(-\gamma(\mathcal{D}\cap\mathbb{S}^{n+m-1})^{2}\big)$. The second inequality holds because of (5) and , namely
[TABLE]
The last inequality is due to (10).
Step 3: Upper Bound on . It follows from Lemma 1 (by setting ) that the event
[TABLE]
holds with probability at least $1-2\exp(-cs^{2}\sigma^{4}/(\psi+\mu)^{4})$.
Putting everything together and taking a union bound, we have that, with probability at least $1-2\exp(-cs^{2}\sigma^{4}/(\psi+\mu)^{4})-\exp\big(-\gamma(\mathcal{D}\cap\mathbb{S}^{n+m-1})^{2}\big)$,
[TABLE]
Rearranging completes the proof of Theorem 1. ∎
IV-B Proof of Theorem 2
Proof.
First note that if , then Theorem 2 holds trivially. So it is sufficient to prove Theorem 2 under assumption .
Similar to Step 1 of the proof of Theorem 1, we have
[TABLE]
Observe that the error vector belongs to a star shaped set, namely . It then follows from Lemma 2 that the following event
[TABLE]
holds with probability at least $1-\exp\big(-\gamma(\mathcal{K}\cap t\mathbb{S}^{n+m-1})^{2}/t^{2}\big)$.
Combining (15) and (16) yields
[TABLE]
Since , we cannot use the upper bound in Lemma 1 directly. Dividing both sides of (17) by , we obtain that
[TABLE]
holds with probability at least $1-2\exp(-cs^{2}\sigma^{4}/(\psi+\mu)^{4})-\exp\big(-\gamma(\mathcal{K}\cap t\mathbb{S}^{n+m-1})^{2}/t^{2}\big)$. In the second inequality we set . The third inequality holds because is star shaped, namely and hence . In the fourth line we let . The last inequality follows from Lemma 1. This completes the proof. ∎
V Conclusion
In this paper, we have analyzed performance guarantees for the extended $\mathcal{K}$-Lasso, which is used to recover a structured signal from corrupted non-linear Gaussian measurements. The theoretical results may be of help in some practical applications, such as dealing with saturation error in quantization, which has been a challenge in signal processing. As for future work, it is worthwhile to derive explicit expressions of the main results for specific problems, and to consider penalized recovery procedures rather than constrained ones for computational purposes.
Appendix A Proof of Lemma 1
A-A Auxiliary Definitions and Facts
To prove Lemma 1, we require some additional definitions and facts.
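Definition 1 (Sub-exponential random variable and vector).
A random variable $X$ is called a sub-exponential random variable if the sub-exponential norm
$$\|X\|_{\psi_1} := \inf\{t>0 : \mathbb{E}\exp(|X|/t)\le 2\}$$
is finite. A random vector $\bm{x}$ in $\mathbb{R}^{n}$ is called a sub-exponential random vector if all of its one-dimensional marginals are sub-exponential random variables. The sub-exponential norm of $\bm{x}$ is defined as
$$\|\bm{x}\|_{\psi_1} := \sup_{\bm{y}\in\mathbb{S}^{n-1}}\|\langle\bm{x},\bm{y}\rangle\|_{\psi_1}.$$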
Fact 2 (Sub-Gaussian distributions with independent coordinates).
[25, Lemma 3.4.2]
Let $\bm{X}=(X_1,\ldots,X_n)\in\mathbb{R}^{n}$ be a random vector with independent, mean zero, sub-Gaussian coordinates $X_i$. Then $\bm{X}$ is a sub-Gaussian random vector, and
$$\|\bm{X}\|_{\psi_2} \le C\max_{i\le n}\|X_i\|_{\psi_2}.$$
Fact 3 (Product of sub-Gaussian is sub-exponential).
[25, Lemma 2.7.7] Let $X$ and $Y$ be sub-Gaussian random variables (not necessarily independent). Then $XY$ is sub-exponential. Moreover,
$$\|XY\|_{\psi_1} \le \|X\|_{\psi_2}\,\|Y\|_{\psi_2}.$$
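Fact 4 (Centering).
[25, Lemma 2.6.8 and Exercise 2.7.10]
If $X$ is sub-Gaussian (or sub-exponential), then so is $X-\mathbb{E}X$. Moreover,
$$\|X-\mathbb{E}X\|_{\psi_2} \le C\|X\|_{\psi_2} \quad \big(\text{respectively } \|X-\mathbb{E}X\|_{\psi_1} \le C\|X\|_{\psi_1}\big).$$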
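Fact 5 (Bernstein-type inequality).
[25, Theorem 2.8.2]
Let $X_1,\ldots,X_m$ be independent, mean-zero, sub-exponential random variables, and $\bm{a}=(a_1,\ldots,a_m)\in\mathbb{R}^{m}$. Then, for any $t\ge 0$, we have
$$\mathbb{P}\Big\{\Big|\sum_{i=1}^{m}a_iX_i\Big|\ge t\Big\} \le 2\exp\Big[-c\min\Big(\frac{t^2}{K^2\|\bm{a}\|_2^2},\ \frac{t}{K\|\bm{a}\|_\infty}\Big)\Big],$$
where $K=\max_i\|X_i\|_{\psi_1}$.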
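Fact 6 (Gaussian concentration).
[25, Theorem 5.2.2]
Consider a random vector $\bm{X}\sim\mathcal{N}(\bm{0},\bm{I}_n)$ and a Lipschitz function $f:\mathbb{R}^{n}\to\mathbb{R}$ with Lipschitz norm $\|f\|_{\mathrm{Lip}}$ (with respect to the Euclidean metric). Then for any $t\ge 0$, we have
$$\mathbb{P}\big\{|f(\bm{X})-\mathbb{E}f(\bm{X})|\ge t\big\} \le 2\exp\Big(-\frac{ct^2}{\|f\|_{\mathrm{Lip}}^2}\Big).$$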
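A quick simulation illustrates what Gaussian concentration means in practice. The sketch below (illustrative only, with arbitrary dimensions and sample counts) uses the Euclidean norm, which is a 1-Lipschitz function, and shows that its fluctuations around the mean stay of constant order while the mean itself grows with the dimension.

```python
import numpy as np

rng = np.random.default_rng(3)
stats = {}
for n in (10, 1000):
    # The map x -> ||x||_2 is 1-Lipschitz, so its spread should not grow with n.
    norms = np.linalg.norm(rng.standard_normal((2000, n)), axis=1)
    stats[n] = (norms.mean(), norms.std())
    print(f"n = {n:4d}: mean = {stats[n][0]:.2f}, std = {stats[n][1]:.2f}")
```

The mean scales roughly like $\sqrt{n}$, whereas the standard deviation remains below one in both cases, in line with a tail bound that depends only on the Lipschitz norm.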
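Fact 7 (Talagrand’s Majorizing Measure Theorem).
[26, Theorem 2.2.27] or [24, Theorem 8]
Let $(X_{\bm{x}})_{\bm{x}\in\mathcal{T}}$ be a random process indexed by points in a bounded set $\mathcal{T}\subset\mathbb{R}^{n}$. Assume that the process has sub-Gaussian increments, that is, there exists $M\ge 0$ such that
$$\|X_{\bm{x}}-X_{\bm{y}}\|_{\psi_2} \le M\|\bm{x}-\bm{y}\|_2 \quad \text{for every } \bm{x},\bm{y}\in\mathcal{T}.$$
Then, for any $u\ge 0$, the event
$$\sup_{\bm{x},\bm{y}\in\mathcal{T}}|X_{\bm{x}}-X_{\bm{y}}| \le CM\big[w(\mathcal{T})+u\cdot\mathrm{diam}(\mathcal{T})\big]$$
holds with probability at least $1-\exp(-u^2)$, where $\mathrm{diam}(\mathcal{T}):=\sup_{\bm{x},\bm{y}\in\mathcal{T}}\|\bm{x}-\bm{y}\|_2$ denotes the diameter of $\mathcal{T}$.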
A-B Proof of Lemma 1
We are now in position to prove Lemma 1. Observe that
[TABLE]
So it suffices to bound the two terms on the right side. To this end, we have the following two lemmas.
Lemma 3.
Under the settings of Lemma 1, then for any , the event
[TABLE]
holds with probability at least .
Proof.
See Appendix B. ∎
Lemma 4.
Under the settings of Lemma 1, the event
[TABLE]
holds with probability at least .
Proof.
Note that are i.i.d. centered sub-Gaussian variables with -norm
[TABLE]
Then by Fact 2, is a sub-Gaussian random vector with
[TABLE]
Define the random process , which has sub-Gaussian increments:
[TABLE]
Note that , it then follows from Talagrand’s Majorizing Measure Theorem (Fact 7) that the event
[TABLE]
holds with probability at least . The last inequality holds because . Setting yields the desired results. ∎
Thus, combining Lemma 3 and Lemma 4 yields the proof of Lemma 1, namely, for any , the event
[TABLE]
holds with probability at least
[TABLE]
In the last inequality we have used the facts that and .
Appendix B Proof of Lemma 3
The proof of Lemma 3 is inspired by [23]. For clarity, the proof is divided into the following three steps.
Step 1: Problem Reduction. Since $\bm{z}_i$ are not independent of $\bm{\Phi}_i$, to facilitate the analysis, we need to “decouple” them as much as possible. To this end, we consider the orthogonal decomposition of the vectors $\bm{\Phi}_i$ along the direction of $\bm{x}^\star$ and its orthogonal complement. More precisely, we express
[TABLE]
where and . Thus we have
[TABLE]
Step 2: Bound . Define $\xi_{i}:=\bm{z}_{i}\langle\bm{\Phi}_{i},\bm{x}^{\star}\rangle=\big[f(\langle\bm{\Phi}_{i},\bm{x}^{\star}\rangle)-\mu\langle\bm{\Phi}_{i},\bm{x}^{\star}\rangle\big]\langle\bm{\Phi}_{i},\bm{x}^{\star}\rangle$. By the definition of $\mu$, it is not hard to check that $\mathbb{E}\,\xi_i=0$. Note that $\bm{z}_i$ have sub-Gaussian norm (see (18)) and $\langle\bm{\Phi}_{i},\bm{x}^{\star}\rangle\sim\mathcal{N}(0,1)$. It then follows from Fact 3 that $\xi_i$ are i.i.d. centered sub-exponential variables with . Let . A Bernstein-type inequality (Fact 5) implies that
[TABLE]
holds with probability at least
[TABLE]
In the last inequality we have used the facts that and .
Step 3: Bound . Let . By the orthogonal decomposition (19), and are independent [23, Lemma 8.1]. Fixing , a direct calculation shows that
[TABLE]
where . Thus, conditioning on , .
Note that are sub-exponential variables with mean and -norm . By Fact 4, are centered sub-exponential variables with -norm . A similar application of Bernstein-type inequality (Fact 5) yields that
[TABLE]
holds with probability at least
[TABLE]
Here we have used the fact that again. Therefore, with probability at least ,
[TABLE]
We next bound using Gaussian concentration. Since , the function has Lipschitz norm at most . Indeed,
[TABLE]
where we choose such that .
Therefore, Gaussian concentration inequality (Fact 6) implies that
[TABLE]
holds with probability at least . The second inequality holds because
[TABLE]
where the inequality follows from the independence of and and Jensen’s inequality.
Taking a union bound yields, with probability at least $1-2\exp\Big(-\frac{cm\sigma^{4}}{K^{4}}\Big)-\exp(-c\epsilon^{2}m)$,
[TABLE]
Putting everything together, we conclude that, for any (noting that ),
[TABLE]
holds with probability at least
[TABLE]
Here we have used again that and . Thus we complete the proof.
References
- [1] V. Chandrasekaran, B. Recht, P. A. Parrilo, and A. S. Willsky, “The convex geometry of linear inverse problems,” Found. Comput. Math., vol. 12, no. 6, pp. 805–849, 2012.
- [2] J. A. Tropp, “Convex recovery of a structured signal from independent random linear measurements,” in Sampling Theory, a Renaissance. Springer, 2015, pp. 67–101.
- [3] R. Vershynin, “Estimation in high dimensions: a geometric perspective,” in Sampling Theory, a Renaissance. Springer, 2015, pp. 3–66.
- [4] C. Thrampoulidis, S. Oymak, and B. Hassibi, “Recovering structured signals in noise: least-squares meets compressed sensing,” in Compressed Sensing and its Applications. Springer, 2015, pp. 97–141.
- [5] P. T. Boufounos and R. G. Baraniuk, “1-bit compressive sensing,” in Information Sciences and Systems, 2008. CISS 2008. 42nd Annual Conference on. IEEE, 2008, pp. 16–21.
- [6] P. McCullagh, “Generalized linear models,” European Journal of Operational Research, vol. 16, no. 3, pp. 285–292, 1984.
- [7] H. Ichimura, “Semiparametric least squares (SLS) and weighted SLS estimation of single-index models,” Journal of Econometrics, vol. 58, no. 1-2, pp. 71–120, 1993.
- [8] J. L. Horowitz and W. Härdle, “Direct semiparametric estimation of single-index models with discrete covariates,” Journal of the American Statistical Association, vol. 91, no. 436, pp. 1632–1640, 1996.
