Recovery guarantees for compressed sensing with unknown errors
Simone Brugiapaglia, Ben Adcock, Richard K. Archibald

TL;DR
This paper establishes robust recovery guarantees for l1-minimization in compressed sensing without requiring prior noise estimates, enhancing practical applicability in high-dimensional and inverse problem contexts.
Contribution
It provides novel theoretical guarantees for l1-minimization robustness when noise estimates are unknown, applicable to various high-dimensional and inverse problem scenarios.
Findings
Robust recovery guarantees for basis pursuit without noise estimates.
Application to high-dimensional function approximation and inverse problems.
Relevance to MRI and other imaging techniques.
Abstract
From a numerical analysis perspective, assessing the robustness of l1-minimization is a fundamental issue in compressed sensing and sparse regularization. Yet, the recovery guarantees available in the literature usually depend on a priori estimates of the noise, which can be very hard to obtain in practice, especially when the noise term also includes unknown discrepancies between the finite model and data. In this work, we study the performance of l1-minimization when these estimates are not available, providing robust recovery guarantees for quadratically constrained basis pursuit and random sampling in bounded orthonormal systems. Several applications of this work are approximation of high-dimensional functions, infinite-dimensional sparse regularization for inverse problems, and fast algorithms for non-Cartesian Magnetic Resonance Imaging.
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Recovery guarantees for compressed sensing with unknown errors
Simone Brugiapaglia1, Ben Adcock2
Department of Mathematics
Simon Fraser University
Burnaby, BC V5A 1S6, Canada
Email: [email protected], [email protected]
Richard K. Archibald
Computer Science and Mathematics Division
Oak Ridge National Laboratory
Oak Ridge, TN 37831, USA
Email: [email protected]
Abstract
From a numerical analysis perspective, assessing the robustness of -minimization is a fundamental issue in compressed sensing and sparse regularization. Yet, the recovery guarantees available in the literature usually depend on a priori estimates of the noise, which can be very hard to obtain in practice, especially when the noise term also includes unknown discrepancies between the finite model and data. In this work, we study the performance of -minimization when these estimates are not available, providing robust recovery guarantees for quadratically constrained basis pursuit and random sampling in bounded orthonormal systems. Several applications of this work are approximation of high-dimensional functions, infinite-dimensional sparse regularization for inverse problems, and fast algorithms for non-Cartesian Magnetic Resonance Imaging.
I Introduction
In Compressed Sensing (CS) and sparse representations we deal with underdetermined linear systems of equations
[TABLE]
where , with , is the sensing matrix, is an unknown signal, and is the vector of measurements perturbed by noise [12, 15]. This corruption could be due to physical noise produced by the measuring device, to approximation errors in the model, or to numerical factors. Some examples are model error in inverse problems such as MRI [19, 22], the expansion error in infinite-dimensional CS when truncating the signal to its finite dimensional representation [1, 4], or the quadrature error involved in the evaluation of the bilinear form associated with a PDE [9, 10, 7].
A standard tool to regularize the inverse problem (25) and recover a good approximation to the solution (assumed to be sparse or compressible) is the Quadratically Constrained Basis Pursuit (QCBP) optimization program
[TABLE]
also called Basis Pursuit (BP) when . Usually, in order to study the recovery guarantees of (2), the parameter is assumed to control the noise magnitude, i.e.,
[TABLE]
Indeed, under the regime (3) and with suitable hypotheses on the sensing matrix (e.g., based on the restricted isometry property), the following type of recovery error estimate holds with high probability
[TABLE]
where is the best -term approximation error of with respect to the -norm [14, 17]. Unfortunately, a priori estimates of the noise of the form (3) may not be available in real applications of CS. Moreover, since the recovery error estimate (4) is sensitive to , the choice of this parameter is crucial (see Figure 1). In practice, one could resort to cross-validation in order to tune this parameter, but this technique could be time-consuming or inaccurate and it is not properly understood from a theoretical perspective [16].
The goal of this work is to establish robust recovery guarantees for QCBP (and BP) under the regime
[TABLE]
In this scenario, recovery estimates analogous to (4)–where is replaced by –hold for BP [17]. They are based on the so-called quotient property, which is known to be satisfied only by random Gaussian matrices [26] and by Weibull matrices [18], under suitable restrictions on the number of measurements . Similar robust recovery estimates are also available for algorithms such as iterative hard thresholding, CoSaMP, and orthogonal matching pursuit [17]. Yet, all these techniques require an a priori knowledge of the sparsity level that is not necessary for QCBP.
Here, we prove robust recovery error estimates for QCBP (and BP) when the matrix is built by random sampling from bounded orthonormal systems [20]. In particular, under suitable hypotheses involving the restricted isometry constants and the singular values of , we provide recovery error estimates in probability of the form (see Theorem 6)
[TABLE]
where the factor is polylogarithmic in and , and can be defined, for example, as in (10). The effect of the unknown error is encapsulated in the third term on the right-hand side (compare with (4)). As is to be expected, this term approaches zero as the estimation of the model error improves.
This analysis has several potential extensions and applications. First, the case of weighted -minimization. This is used notably in high-dimensional function approximation and interpolation [2, 21, 23], with applications in uncertainty quantification for parametric PDEs. Second, fast methods for non-Cartesian MRI, where model error arises from gridding non-uniform Fourier data to a uniform grid, and can seriously hamper reconstruction quality [3, 19]. Third, low-rank matrix recovery.
II Tools from compressed sensing
We first recall some concepts from CS that constitute the foundations that our analysis will be built upon: the null space and the restricted isometry properties (Section II-A), and random sampling in bounded orthonormal systems (Section II-B).
In the following, for every , we define and . Moreover, we denote the set of -sparse vectors in as .
II-A Robust null space and restricted isometry properties
The first tool involved in our analysis is the so-called -robust Null Space Property (NSP) [17, Chapter 4].
Definition 1** (-robust null space property)**
Given , the matrix satisfies the -robust null space property of order (with respect to the norm ) with constants and if, for any set with , it holds
[TABLE]
Moreover, we recall the well-known restricted isometry property (also known as “RIP”), where the sensing matrix is required to behave similarly to an isometry when its action is restricted to the set of sparse vectors [12].
Definition 2** (Restricted isometry property)**
The restricted isometry constant of a matrix is the smallest constant such that
[TABLE]
The matrix has the Restricted Isometry Property (RIP) of order if .
It is well-known that is a sufficient condition for the -robust NSP to hold [17, Theorem 6.13] (we decide to use this condition and not that presented in [11] for ease of exposition).
II-B Bounded orthonormal systems
Our analysis focuses on the case of measurement matrices arising from random sampling from a Bounded Orthonormal System (BOS) [20]. Some significant examples of random sampling from a BOS are the subsampled Fourier transform, nonharmonic Fourier measurements, and random evaluation of orthogonal polynomials. We will discuss these case studies in more detail in Section III-C.
Definition 3** (Random sampling from a BOS)**
Let be endowed with probability measure . Then, a set of complex-valued functions on is called a Bounded Orthonormal System (BOS) with constant if, for every , it holds and . Moreover, given independent random variables , distributed according to , we define the random sampling matrix associated with a the BOS as
[TABLE]
A crucial property of this kind of matrices is the following. For every , assuming
[TABLE]
the restricted isometry constant of satisfies with probability at least , where the factor is polylogarithmic in and and can be chosen in different ways (see [20, 17, 21, 13]). For example, according to [17, Theorem 12.32], it can be defined as follows
[TABLE]
III Robustness analysis
In this section, we illustrate our robustness analysis. For proofs of the results stated here, we refer to [8].
III-A Singular values of tall random matrices
Given a “tall” matrix , we denote and sort its singular values as follows
[TABLE]
First, we provide a robust error estimate for QCBP as defined in (2) assuming the -robust NSP. This estimate depends on the minimum singular value of the Hermitian conjugate of the sensing matrix .
Proposition 4** (Robust error estimate based on the NSP)**
Let be a matrix of rank that satisfies the -robust NSP of order with constants and with respect to (see Definition 1). Then, the following error estimate holds for problem (2)
[TABLE]
where the hidden constant depends on and .
Proposition 4 shows that, in order to get robust error estimates for QCBP, we need to understand the asymptotic behavior of . In particular, our goal is to show that . With this aim, we employ principles and ideas from the theory of random matrices with isotropic heavy-tailed columns [25].
Recall that a random vector is said to be isotropic if . Then, consider random matrices of the form
[TABLE]
where the columns are independent random isotropic vectors. We introduce the cross-coherence parameter, defined as
[TABLE]
This parameter controls the off-diagonal part of the Gram matrix of . We also define the distortion parameter as
[TABLE]
It measures how far the columns of are from being normalized. We notice that if almost surely for every , then vanishes. Using this two parameters, we give a generalization of [25, Theorem 5.62].
Theorem 5** (Singular values of heavy-tailed matrices)**
Let be an matrix () whose columns are independent isotropic random vectors in . Then, its singular values satisfy the following asymptotic estimate
[TABLE]
where and are defined as in (13) and (14), respectively.
We observe that a necessary condition for estimate (15) to be nontrivial is that the quantity should be bounded uniformly in (but not necessarily in ). Regarding the incoherence parameter, exploiting the isotropy of the columns of , and assuming for a suitable constant (this is always the case for , where is the random sampling matrix associated with a BOS), we can prove that
[TABLE]
We check the sharpness of this upper bound for moderate values of numerically in the case of the subsampled Fourier transform, where (see Figure 2).
III-B Recovery error estimate
The following theorem is the main result of the paper. It provides a robust error estimate in probability for QCBP, when the solution is very sparse and is a random sampling matrix associated with a BOS.
Theorem 6** (Robust recovery error estimate for QCBP)**
Consider a BOS with constant and with a distortion parameter that satisfies
[TABLE]
for suitable constants independent of and . Then, there exist constants and a function depending polylogarithmically on and such that the following holds. For every and , assume that the sparsity level satisfies
[TABLE]
and let be the random sampling matrix associated with and with a number of measurements
[TABLE]
Then, the following robust error estimate holds for QCBP
[TABLE]
with probability at least . The constant depends on , , and , whereas the constants , , , and are universal. The function can be defined as in (10), with .
The required relation between and is linear up to logarithmic factors, in accordance with the usual recovery error estimate in CS. However, there are three main limitations of Theorem 6 that are worth underlining. First, the result holds for a particular sparsity regime (18). Essentially, fixed the failure probability , we require (up to logarithmic factors). Second, the error estimate (20) actually depends on , but this dependence is only polylogarithmic (the factor is due to the term in the error estimate of Proposition 4). Third, there is a linear dependence between and in (18). Thus, the failure probability of the estimate is not “overwhelmingly low”. These three issues are open problems currently under investigation.
III-C Applications
To conclude, we discuss some applications of Theorem 6 to concrete examples from signal processing and high-dimensional polynomial approximation.
III-C1 Fourier and Chebyshev BOSs
We discuss two examples of BOSs very popular in CS. First, we consider the subsampled Fourier transform. We have , the system is defined as
[TABLE]
and the sampling measure is the uniform discrete distribution on . It turns out that and, consequently, the distortion parameter is . As a result, condition (17) of Theorem 6 is satisfied.
In the case of the Chebyshev system, we consider the Chebyshev orthogonal polynomials on , defined as
[TABLE]
[TABLE]
They form a BOS with respect to the Chebyshev measure on and with constant . By studying the normalized Christoffel function associated with this system, we estimate that the distortion parameter decays proportionally to . Therefore, hypothesis (17) holds true and we can apply Theorem 6.
III-C2 High-dimensional polynomial approximation
We assess the robustness of QCBP for polynomial approximation in high dimension [1]. Consider the multivariate function
[TABLE]
We fix and we employ the tensorized version of the Chebyshev polynomials defined in (22)-(23) over as a sparsity basis. We set maximum degree on each variable and we restrict the multi-index space to the hyperbolic cross shape [5]. These choices leads to a total of degrees of freedom. We evaluate at random independent sampling points identically distributed according to the tensorized Chebyshev measure over . In order to assess the robustness to unknown error, we artificially add centered gaussian noise with standard deviation to the measurements, namely
[TABLE]
In the case of nonintrusive methods for the uncertainty quantification of PDEs with random parameters, we can interpret this noise as the numerical error associated with the black-box PDE solver used to produce point-wise samples of the quantity of interest [16].
In Figure 3, we plot the absolute error as a function of for different values of the standard deviation . The resulting curve always exhibits a global minimum. We observe that, for , underestimating is better than overestimating it. For , the minimum becomes more pronounced and underestimating becomes more penalizing. Recalling (20) in Theorem 6, this behavior could be justified as follows: for small values of , the recovery error is dominated by the term , whereas, the more gets larger, the more the term becomes dominant.
In order to estimate the value of that minimizes the error, we evaluate the residual on a reference solution computed via least-square fitting over an oversampled random grid of size , i.e.,
[TABLE]
We compare this value with the one computed by cross-validation [16]. Both approaches are able to approximate the minimum quite well for , whereas, in the case , cross-validation slightly underperforms.
III-C3 Non-Cartesian Magnetic Resonance Imaging
Finally, in Figure 4 we give an application of this analysis to non-Cartesian MRI. For fast reconstruction in sparse MRI, non-Cartesian data is often preprocessed by gridding it to a uniform integer grid, thus ensuring that the sampling matrix can be expressed as a subsampled DFT matrix. This introduces model errors, which, as seen in Figure 4, adversely affect the reconstruction. On the other hand, the gridding strategy introduced in [3] leads to model errors of order , where is a user-controlled parameter. Theorem 6 theoretically establishes the advantage of this higher-fidelity gridding, as verified in Figure 4.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] B. Adcock, “Infinite-dimensional ℓ 1 superscript ℓ 1 \ell^{1} minimization and function approximation from pointwise data” Constr. Approx. , (to appear), 2016.
- 2[2] B. Adcock, “Infinite-dimensional compressed sensing and function interpolation”, Found. Comput. Math. , (in revision), 2016.
- 3[3] B. Adcock, R. Archibald, A. Gelb, G. Song, R. B. Platte and E. G. Walsh, “Parameter assessment from time-dependent MR signals using sequential imaging”, preprint, 2016.
- 4[4] B. Adcock and A.C. Hansen, “Generalized sampling and infinite-dimensional compressed sensing,” Found. Comput. Math. , vol. 16, no. 5, pp. 1263–1323, 2016.
- 5[5] K.I. Babenko, “Approximation by trigonometric polynomials in a certain class of periodic functions of several variables.” Dokl. Akad. Nauk SSSR. , Vol. 132, no. 5, 1960.
- 6[6] Brain Web: Simulated Brain Database. http://brainweb.bic.mni.mcgill.ca/brainweb/
- 7[7] S. Brugiapaglia, “C Omp Ressed Solv ING: sparse approximation of PD Es based on compressed sensing”, Ph.D. Thesis, Politecnico di Milano, 2016.
- 8[8] S. Brugiapaglia, B. Adcock, R.K. Archibald, “Robustness to unknown error in sparse regularization”, in preparation , 2017.
