Simultaneous Sparse Recovery and Blind Demodulation
Youye Xie, Michael B. Wakin, Gongguo Tang

TL;DR
This paper introduces a novel method for simultaneous sparse signal recovery and blind demodulation, leveraging atomic norm minimization, with theoretical guarantees and practical validation in noisy and noiseless scenarios.
Contribution
It proposes a new approach using lifting and atomic norm minimization for joint sparse recovery and blind demodulation with theoretical sample complexity bounds.
Findings
Near optimal sample complexity bounds for perfect recovery.
Effective recovery in noisy conditions demonstrated.
Theoretical analysis supported by numerical simulations.
Abstract
The task of finding a sparse signal decomposition in an overcomplete dictionary is made more complicated when the signal undergoes an unknown modulation (or convolution in the complementary Fourier domain). Such simultaneous sparse recovery and blind demodulation problems appear in many applications including medical imaging, super resolution, self-calibration, etc. In this paper, we consider a more general sparse recovery and blind demodulation problem in which each atom comprising the signal undergoes a distinct modulation process. Under the assumption that the modulating waveforms live in a known common subspace, we employ the lifting technique and recast this problem as the recovery of a column-wise sparse matrix from structured linear measurements. In this framework, we accomplish sparse recovery and blind demodulation simultaneously by minimizing the induced atomic norm, which in…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Simultaneous Sparse Recovery and Blind Demodulation
Youye Xie, Michael B. Wakin, and Gongguo Tang
Department of Electrical Engineering, Colorado School of Mines, USA The authors are with the Department of Electrical Engineering, Colorado School of Mines, Golden, CO 80401, USA, e-mails: [email protected], [email protected]@mines.edu, [email protected].
Abstract
The task of finding a sparse signal decomposition in an overcomplete dictionary is made more complicated when the signal undergoes an unknown modulation (or convolution in the complementary Fourier domain). Such simultaneous sparse recovery and blind demodulation problems appear in many applications including medical imaging, super resolution, self-calibration, etc. In this paper, we consider a more general sparse recovery and blind demodulation problem in which each atom comprising the signal undergoes a distinct modulation process. Under the assumption that the modulating waveforms live in a known common subspace, we employ the lifting technique and recast this problem as the recovery of a column-wise sparse matrix from structured linear measurements. In this framework, we accomplish sparse recovery and blind demodulation simultaneously by minimizing the induced atomic norm, which in this problem corresponds to the block norm minimization. For perfect recovery in the noiseless case, we derive near optimal sample complexity bounds for Gaussian and random Fourier overcomplete dictionaries. We also provide bounds on recovering the column-wise sparse matrix in the noisy case. Numerical simulations illustrate and support our theoretical results.
Index Terms:
Sparse recovery, blind demodulation, atomic norm minimization, sparse matrix recovery
I Introduction
I-A Overview
111Parts of the results in this paper were presented at the 44th International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019 [1].
In classical sparse recovery and compressive sensing problems, a system observes where , , and are the sensing matrix, dictionary matrix, and sparse signal coefficient vector, respectively. The goal is to recover the sparse vector from the measurements . Usually and are known, but the whole system is under-determined. This model arises naturally in a wide range of applications such as medical imaging [2], seismic imaging [3], video coding [4], and network traffic monitoring [5].
In the special case where is diagonal and contains a carrier signal or the Fourier coefficients of a known source signal along in its diagonal entries, can be viewed as a modulated version of the signal [6] or the Fourier transform of the convolution between two source signals [7]. Recovering can thus be viewed as a demodulation (or deconvolution) problem. Unfortunately, in problems like super resolution [8] and self-calibration [9], the modulation matrix is unknown a priori, as it incorporates the unknown point spread functions or calibration parameters. Recovering and jointly is a simultaneous sparse recovery and blind demodulation problem.
In this paper, we consider a more general sparse recovery and blind demodulation problem in which each atom comprising the signal undergoes a distinct modulation process. Under the assumption that the modulating waveforms live in a known common subspace, we employ the lifting technique and recast this problem as the recovery of a column-wise sparse matrix from structured linear measurements. In this framework, we recover the sparse coefficient vector and all of the modulating waveforms simultaneously by minimizing the induced atomic norm [10, 11], which in this problem corresponds to the block norm minimization and we also refer to it as the norm minimization.
I-B Setup and Notation
To better illustrate our main contributions and compare to related work, we first define our signal model and the corresponding atomic norm minimization problem.
Throughout this paper, we use bold uppercase, , bold lowercase, , and non-bold letters, , to represent matrices, vectors, and scalars. We use , and to denote respectively complex conjugate, matrix Hermitian, and matrix transpose. The symbol denotes a constant. (, resp.) is a matrix (vector, resp.) that zeros out the columns (entries, resp.) not in . We call the support of the matrix (and vector ), and we use to denote the sub-matrix after removing the zero rows or columns in . when and otherwise. . We use to indicate the spectral norm, which returns the maximum singular value of a matrix. The norm of a matrix , denoted by , is defined to be . The inner product between vectors and matrices are defined as and respectively.
I-C Problem Formulation
In this paper, we study a generalized sparse recovery and blind demodulation problem in which the coefficient vector is unknown and each atom (column) of the dictionary undergoes an unknown modulation process. Specifically, we assume the system receives a composite signal
[TABLE]
where is an unknown scalar, is an unknown diagonal modulation matrix, and is the -th atom from a known dictionary with . Our goal is to recover both and for all from the measurement .
To make this problem well-posed, among the over-complete atoms, we assume only of them contribute to the observed signal; that is, at most coefficients are nonzero. We furthermore assume that each modulation matrix obeys a subspace constraint:
[TABLE]
where () is a known basis for the -dimensional subspace of possible modulating waveforms, and is an unknown coefficient vector. Similar subspace assumptions have been made in deconvolution and demixing papers [12, 13]. With this assumption, recovering and equals to recovering and . Since for any , without loss of generality, we assume has unit norm and with its complex phase and sign absorbed by .
Define and note that the -th entry of the observed signal can be expressed as
[TABLE]
where , and is the -th column of the identity matrix. From (I.3), we see that the measurement vector depends linearly on the matrix which encodes all of the unknown parameters of interest. We denote this linear sensing process as and recast the recovery problem as that of recovering (and its components) from the linear measurements.
The unknown matrix can be viewed as a linear combination of rank- matrices from the atomic set and thus we propose to recover using the corresponding atomic norm minimization:
[TABLE]
The atomic norm appearing in (I.4) is defined as . Moreover, the following result establishes its equivalence with the norm.
Proposition 1**.**
The atomic norm optimization problem (I.4) can be equivalently expressed as the following norm optimization problem
[TABLE]
where and represents the following linear sensing process
[TABLE]
in which and are the -th column of and .
Proof.
We first note that the atomic norm can be equivalently expressed as . To see this, consider any decomposition of of the form with . Define and write . This is equivalent to writing where and . Finally, note that .
Next, to establish the equivalence with the norm, for any and with , define and . Then
[TABLE]
Finally, to establish the equivalence of the linear sensing process, (I.3) indicates that for ,
[TABLE]
∎
The above optimization focuses on recovering the structured matrix from linear measurements. Once the optimization is solved, the unknown parameters can be easily extracted from the solution as follows:
[TABLE]
for and .
The adjoint of the linear operator is . The linear operator also has a matrix-vector multiplication form. Note that , where is
[TABLE]
in which and is the -th column of . Furthermore,
[TABLE]
where .
Finally, we note that the observed signal could be contaminated with noise. In this case, our measurement model becomes
[TABLE]
for some unknown noise vector which we suppose satisfies . In this case, we can write , where is the ground truth solution. As an alternative to equality-constrained norm minimization (I.5), we then consider the following relaxation:
[TABLE]
I-D Applications of The Proposed Signal Model
The proposed signal model encompasses a wide range of applications. We briefly introduce some of them as follows.
I-D1 Direction of arrival estimation for antenna array
We first consider the direction of arrival (DOA) estimation problem in antenna array. Assume we have a linear array antenna consisting of elements, and we want to estimate the DOAs of several sources from a snapshot of the received signal. In addition, we consider the narrowband scenario and confine the the array and the far-field sources to a common plane as described in [14]. In this case, the DOA is determined by the azimuth angle, , of the source, which ranges from [math] to degrees. Mathematically, after discretizing the azimuth angle into grids, the observervation of the array can be represented as [15]
[TABLE]
where is the diagonal matrix capturing the unknown calibration of the array elements [9]. Particularly, the calibration issue may arise from gain discrepancies caused by the change of temperatures and humidity of the environment [9]. Namely, the channel is not ideal. One can simulate different scenarios and collect many possible calibration vectors. By applying the singular value decomposition (SVD) on the matrix formed by those calibration vectors, we can then extract the subspace matrix, , with desired dimensions to approximate the calibration using where is the unknown coefficient vector. is the known array manifold matrix whose columns for are the steering vectors. For uniformly spaced linear array antenna (ULA), where is the distance between array elements and is the radar operating wavelength [16]. Moreover, the entries of indicate the strength of the impinging signals and if there exists sources, only entries of are nonzero. consists of the discretization error, approximation error, and additive noise.
Furthermore, let us consider a more severe while realistic situation, where the calibration is sensitive to the direction of arrival which implies that the channel responses from different angles are slightly different. So that the calibration matrix, , are different for different . In this case, we can write
[TABLE]
I-D2 Super-resolution for single molecule imaging
Another application is the single molecule imaging [17] via stochastic optical reconstruction microscopy (STORM) [18]. In this application, the cellular structure of the object of interest is dyed with fluorophores, and STORM divides the imaging process into thousands of cycles. Within each cycle or observation, only a portion of the fluorophores are activated and imaged. Therefore, a typical observation is a low-resolution frame with its activated fluorophores convolved with the non-stationary point spread functions of the microscope, which can be represented as
[TABLE]
where is a vectorized, imaged frame downsampled from its super-resolution image with pixels, represents the intensity of the activated fluorophores, and is the subspace that the point spread functions live in. , which indicates the location of the activated flurophores, is the -th column of the identity matrix and denotes the noise. Moreover, can also be represented equivalently as
[TABLE]
where is the inverse discrete Fourier transform (DFT) operator, with , and s are the DFT of spikes containing the location information. . The goal of this application is to recover the super-resolution image from its low-resolution frame , or mathematically, locating the nonzero .
Other applications that fit into the model investigated in this work include frequency estimation with damping that appears in nuclear magnetic resonance spectroscopy [19] with damping signals approximately living in a common subspace [8] and the CDMA system with spreading sequence sensitive channel as described in Section 6.4 of [9].
I-E Main Contributions
Our contributions are twofold. First, we employ norm minimization to achieve sparse recovery and blind demodulation simultaneously given the generalized signal model from equation (I.1). Second, for perfect recovery of all parameters in the noiseless case, we derive near optimal sample complexity bounds for the cases where is a random Gaussian and a random subsampled Fourier dictionary. Both of bounds require the number of measurements to be proportional to the number of degrees of freedom, , up to log factors. We also provide bounds on recovering the column-wise sparse matrix in the noisy case; these bounds show that the recovery error scales linearly with respect to the strength of the noise.
I-F Related Work
The norm has been widely used to promote sparse recovery in multiple measurement vector (MMV) problems [20, 21]. The MMV problem involves a collection of sparse signal vectors that are stacked as the rows of a matrix . These signals have a common sparsity pattern, which results in a column-wise sparse structure for . As in our setup, the norm is used to recover from linear measurements of the form . However, has a block diagonal structure where all diagonal sub-matrices are the same which is the dictionary matrix. This is different from the structure of the linear measurements in our problem; see for example (I.8).
Our work is also closely related to certain recent works in model-based deconvolution, self-calibration, and demixing. When all in (I.1) are the same, our signal model coincides with the self-calibration problem in [9], although that work employs norm minimization rather than norm minimization to recover . A more recent paper [22] does apply the norm for the self-calibration problem but again assumes a common modulation matrix . The paper [12] generalizes the work of [9] and considers a blind deconvolution and demixing problem which can be interpreted as the self-calibration scenario with multiple sensors whose calibration parameters might be different. However, the signal model in that paper is not directly comparable to our model, and the recovery approach studied in that paper involves nuclear norm minimization and requires knowledge of the number of sensors. A blind sparse spike deconvolution is studied in [13], wherein the dictionary consists of sampled complex sinusoids over a continuous frequency range and all atoms undergo the same modulation. Inspired by [13], [8] generalizes the model to the case of different modulating waveforms. Like [13], however, [8] also considers a sampled sinusoid dictionary over a continuous frequency range, and it employs a random sign assumption on the coefficient vectors which makes it difficult to derive recovery guarantees with noisy measurements. More works considering a common modulation process can be found in [7, 23, 24].
Our work can be viewed as a generalization of the self-calibration [9] and blind deconvolution problems [7]. Moreover, our analysis is quite different from the works considering the continuous sinusoid dictionary [13, 8], since the tools in those papers are specialized to the continuous sinusoids dictionary and we consider discrete Gaussian and random Fourier dictionaries in both noiseless and noisy settings.
The rest of the paper is organized as follows. In Section II, we present our main theorems regarding perfect parameter recovery in the noiseless setting and matrix denoising in the noisy setting. Sections III and IV contain the detailed proofs of the main theorems. Several numerical simulations are provided in Section V to illustrate the critical scaling relationships, and we conclude in Section VI.
II Main Results
We present our main theorems in this section. In each of the noisless and noisy cases, we consider two models for the dictionary matrix . In the first model, is a real-valued random Gaussian matrix, with each entry sampled independently from the standard normal distribution. In the second model, is a complex-valued random Fourier matrix, with each of its rows chosen uniformly with replacement from the discrete Fourier transform matrix where . Our first theorem concerns perfect parameter recovery in the noiseless setting.
Theorem II.1**.**
(Noiseless case) Consider the measurement model in equation (I.1), assume that at most coefficients are nonzero, and furthermore assume that the nonzero coefficients are real-valued and positive. Suppose that each modulation matrix satisfies the subspace constraint (I.2), where and each has unit norm 222Theorem II.1 actually works for with arbitrary norms as long as the relative scale between and is known. .
Then the solution to problem (I.5) is the ground truth solution —which means that , , and can all be successfully recovered for each using (I.7)—with probability at least
- •
if is a random Gaussian matrix and
[TABLE]
- •
if is a random Fourier matrix and
[TABLE]
where .
In both cases, is a constant defined for and the coherence parameter
[TABLE]
We note that both of the sample complexity bounds in Theorem II.1 require the number of measurements to be proportional to the number of degrees of freedom, , up to log factors. We also note that the sample complexity bounds scale with the square of the coherence parameter . Under the assumption which requires the columns of to be orthonormal, . Specifically, given the system parameters with large enough , (II.1) is satisfied when . The valid range of for (II.2) and the noisy case can be easily derived in the same manner. And is minimized when the energy of each column of is not concentrated on a few entries but spread across the whole column.
Our second theorem provides bounds on recovering the column-wise sparse matrix in the noisy case; these bounds show that the recovery error scales linearly with respect to the strength of the noise.
Theorem II.2**.**
(Noisy case) Consider the measurement model in equation (I.10), assume that at most coefficients are nonzero, and furthermore assume that the norm of the noise is bounded, . Suppose also that each modulation matrix satisfies the subspace constraint (I.2), where .
Then with probability at least , the solution to problem (I.11) satisfies
- •
if is a random Gaussian matrix,
[TABLE]
when
[TABLE]
where is a constant.
- •
if is a random Fourier matrix,
[TABLE]
when
[TABLE]
where and .
In both cases, is defined for . and are constant.
Although Theorem II.2 focuses exclusively on bounding the recovery error of the matrix , one can also attempt to estimate the parameters , , and from using (I.7). And according to Theorem II.2, for any and where and are the -th columns of the solution and the ground truth respectively, we would have with random Gaussian dictionary and for random Fourier dictionary. In addition, as results on structured matrix recovery from (possibly noisy) linear measurements, we believe that Theorems II.1 and II.2 may be of independent interest outside of the sparse recovery and blind demodulation problem.
III Proof of Theorem II.1
To begin our proof of the main theorem in the noiseless case, we first derive sufficient conditions for exact recovery.
III-A Sufficient Conditions for Exact Recovery
Sufficient conditions for exact recovery are the null space property and an alternative sufficient condition derived from the null space property. Similar sufficient conditions with complete proofs are available for minimization problems using other types of norms [9, 25, 26, 27]. However, since we cannot find sufficient conditions that suit our purpose and in order to be self-contained, we provide a short proof for the ones specific to the norm minimization problem in this section.
Proposition 2**.**
(The null space property) The matrix with support is the unique solution to the inverse problem (I.5) if
[TABLE]
for any in the nullspace of .
Proof.
Let be a solution to problem (I.5), with . To prove is the unique solution, it is sufficient to show that if . We start by observing that
[TABLE]
where and the first inequality comes from the fact that
[TABLE]
Therefore, as long as for any in the nullspace of , is the unique solution. ∎
Proposition 3**.**
The matrix with support is the unique solution to the inverse problem (I.5) if there exists and a matrix in the range space of such that
[TABLE]
and the operator satisfies ()
[TABLE]
Proof.
Proposition 2 shows that to establish uniqueness, it is sufficient to prove that for any in the nullspace of . Note that
[TABLE]
since . By applying the Hölder inequality, we get a stronger condition
[TABLE]
Since and , we have , and
[TABLE]
Plugging (III.3) into the stronger condition above yields
[TABLE]
Therefore, if , , and , the left hand side is positive. On the other hand, if , from (III.3), and . ∎
III-B Bounding The Isometry Constant and Operator Norm
In this section, we bound the isometry constant and operator norm appearing in (III.2) based on the randomness in the matrix . The isometry bound for the linear operator can be found in Lemma 4.3 in [9].
Lemma III.1**.**
[9, Lemma 4.3]** (Isometry) For the linear operator defined in (I.5) with and ,
[TABLE]
with probability at least where is the identity operator on the support such that ,
- •
if is a random Gaussian matrix and .
- •
if is a random Fourier matrix and .
Here is a constant that grows linearly with .
and can be viewed as constructed using , whose -th column is zero if , following (I.8). Therefore, has many zero columns and removing those zero columns results in . If , is invertible and according to Lemma A.12 in [26]. This property will be applied in (III.5) and Theorem III.1. To bound the operator norm of , we use results from [7] and [9].
Lemma III.2**.**
[7, 9]** For the linear operator defined in (I.5) with and ,
- •
if is a random Gaussian matrix,
[TABLE]
with probability at least .
- •
if is a random Fourier matrix,
[TABLE]
with probability at least when .
III-C Constructing The Dual Certificate for The Gaussian Case
In the case where is a random Gaussian matrix, we construct a certificate matrix that satisfies the conditions in Proposition 3. When , we can set
[TABLE]
where
[TABLE]
By construction, , and we need only to verify that .
Theorem III.1**.**
If , there exists in the range space of such that
[TABLE]
with probability at least when for .
Proof.
To simplify the notation, without loss of generality, we assume the support of is the first columns. Let be the dual certificate matrix defined in (III.4). After removing the columns of on support , we obtain which takes the form
[TABLE]
The columns of are independent of since is constructed with . Equivalently,
[TABLE]
Thus where is real and
[TABLE]
We set and have
[TABLE]
since each row of can be bounded by
[TABLE]
since we assume which implies . By generalizing Proposition 1 in [28] to our case, we have
[TABLE]
In addition, because is positive semi-definite and all its eigenvalues are non-negative, where is the -th eigenvalue of . where is the maximum singular value of . Therefore, for , we obtain
[TABLE]
If we pick , with probability at most . Taking the union over all non-zero columns of gives
[TABLE]
Therefore, with probability at least when . To make the probability meaningful, should be greater than .
∎
III-D Proof of Theorem II.1 for Random Gaussian Dictionary
In this section, we assemble the pieces to complete the proof of Theorem II.1 in the Gaussian case. To do so, we ensure that all sufficient conditions in Proposition 3 are met. First, if we take and set in Lemma III.1, we have
[TABLE]
when with probability at least . Then, applying the same in Lemma III.2 and setting , we have that with probability at least . In Theorem III.1, we have proved that and when with probability at least and .
Note that if , we have . Combining the above requirements on , all conditions in Proposition 3 are satisfied with probability at least when . Furthermore,
[TABLE]
if we set and , which yields the Theorem II.1 when is a random Gaussian matrix.
III-E Constructing The Dual Certificate for The Fourier Case
In this section, we construct a certificate that satisfies the inexact duality condition in Proposition 3 when is a random Fourier matrix. Specifically, we construct the dual certificate using the golfing scheme [29] which has been widely applied in compressive sensing [7, 25]. In the golfing scheme, a series of matrices in the range of are constructed iteratively. In each iteration step, only some of the measurements are utilized to ensure independence between iterations. And the constructed matrices will converge to on support while entries on are small. The goal is to find the conditions under which the final constructed matrix can serve as the certificate matrix.
According to Section (4.2.1) in [9], there exists a partition of the measurements into disjoint subsets such that each subset, , contains elements and
[TABLE]
where and . So
[TABLE]
Define and [math] on entries . . The golfing scheme iterates through
[TABLE]
Theorem III.2**.**
If is the ground truth solution to problem (I.5), there exists a matrix such that
[TABLE]
with probability at least for when
[TABLE]
and
[TABLE]
where a constant determined by .
Proof.
If we define , (III.7) gives
[TABLE]
where with [math] on entries and which are used to generate the sequence . And we can obtain
[TABLE]
with probability at least when with applying Lemma 4.6 in [9]. Therefore,
[TABLE]
To ensure that where is the final constructed dual certificate after iterations, we need
[TABLE]
We now turn to find the conditions such that . Note that substituting into equation (III.7) yields
[TABLE]
It is sufficient to show , where is the projection operator which projects a matrix on the support , to make because
[TABLE]
Defining to be with non-zero rows indexed by and zero otherwise, we have and for a vector where has support ,
[TABLE]
where is the column index and is the -th column of the identity matrix . In addition,
[TABLE]
Furthermore, we have because
[TABLE]
following since is the transpose of a random row of the DFT matrix and has support and on . Therefore, for , . Moreover,
[TABLE]
Because each entry of can be bounded by
[TABLE]
where the third equality holds because and has support T. The variance of is also bounded:
[TABLE]
because for each element of , we have
[TABLE]
and therefore
[TABLE]
As a result,
[TABLE]
The second inequality in (III.13) applies the inequality (III.6) and . We then apply the matrix Bernstein inequality from Theorem 1.6 in [30]. If we set and we know from (III.10) that , we obtain
[TABLE]
where , for a particular and . We then take the union over all and get
[TABLE]
To ensure for all , we obtain
[TABLE]
when using the same as in deriving equation (III.9). Setting , where is a constant comes from equation (III.6), gives us Theorem III.2. ∎
III-F Proof of Theorem II.1 for Random Fourier Dictionary
We now complete the proof of Theorem II.1 in the case when is a random Fourier matrix. First, combining the conditions and probabilities from Lemma III.1 and III.2, we know that the operator satisfies the inequalities and with probability at least when for some constant, , that grows linearly with .
Applying the same in Theorem III.2, the desired dual matrix exists with probability at least when . Merging the requirement on by setting and combining the probabilities, we complete the proof by applying Proposition 3.
IV Proof of Theorem II.2
To derive our recovery guarantee in the presence of measurement noise, the main ingredient of the proof is Theorem IV.1 which is a variation of the Theorem 4.33 in [26] from the infinity norm optimization to norm optimization problem.
Theorem IV.1**.**
Define and . Suppose the ground truth to (I.11) has non-zero columns with support and the measurement vector with . For and , assume that
[TABLE]
and that there exists a matrix such that
[TABLE]
If , then the minimizer, , to (I.11) satisfies
[TABLE]
where and are two constants depending on .
Proof.
Due to our assumption on the noise, is a feasible solution. Assume the final minimizer to (I.11) is , which implies
[TABLE]
where the second inequality comes from equation (III.1). Thus
[TABLE]
The last inequality comes from the Hölder inequality and our assumption , which tells us
[TABLE]
and
[TABLE]
Moreover, can also be bounded as follows.
[TABLE]
because ensures that and according to Lemma A.12 and Proposition A.15 in [26] respectively. Furthermore,
[TABLE]
in which is the -th column of . By setting , and substituting the inequality (IV.2) into (IV.1), we obtain
[TABLE]
Substituting inequality (IV.3) into (IV.2) yields
[TABLE]
Combining the above two inequalities, we obtain
[TABLE]
∎
Next, we specify the values of the variables , , and when is a random Gaussian and Fourier matrix. The Orlicz-1 norm [7] and associated matrix Bernstein inequality are needed for determining the value of when is Gaussian. Specifically, the Orlicz-1 norm is defined as [7]
[TABLE]
Its associated matrix Bernstein inequality is provided in Proposition 3 in [7] which can be rewritten as
Proposition 4**.**
Let be independent random matrices with . Suppose
[TABLE]
and define
[TABLE]
Then there exists a constant such that for
[TABLE]
The following theorem utilizes the Proposition 4 and depicts the conditions under which .
Theorem IV.2**.**
For defined in (I.8) and ,
[TABLE]
with probability at least
- •
if is a random Gaussian matrix and
[TABLE]
- •
if is a random Fourier matrix and
[TABLE]
where is a constant that grows linearly with and is a constant.
Proof.
We first prove the Gaussian case; the Fourier case is very similar. Note that for an arbitrary
[TABLE]
where is but only contains values in the -th to -th columns and is zero otherwise. can also be viewed as an extension of by padding zero columns. Moreover, is the conjugate of the -th column of who has only one non-zero value in the -th entry. In addition, for . By applying the property of the Kronecker product, we estimate the spectral norm of which can be used to determine its Orlicz-1 norm:
[TABLE]
in which contains non-zero values in the entries indexed by . Therefore, follows the Chi-squared distribution with degrees of freedom which implies that for some constant according to the proof of Lemma 4.7 in [9] and the definition of Orlicz-1 norm in (IV.5). Moreover,
[TABLE]
following from the fact that for all and from the assumption. On the other hand,
[TABLE]
Therefore, . Substituting the variables and into Proposition 4 and taking the union bound over all results in
[TABLE]
Define a variable and set
[TABLE]
where . Simplifying the probability term gives
[TABLE]
Following the same procedures, when is a random Fourier matrix and for any , we have , and . The matrix Bernstein inequality implies
[TABLE]
Similarly, if we define a variable and let
[TABLE]
by setting , simplifying the probability gives us
[TABLE]
∎
IV-A Proof of Theorem II.2 for Random Gaussian Dictionary
According to Section III-D, , and with probability at least when . Moreover, in Theorem III.1, where we construct the dual certificate matrix when is a random Gaussian matrix, we define and leads to . So
[TABLE]
which implies . If we use the same in Theorem IV.2, we have with probability at least when
[TABLE]
Combining the requirement on and setting yield
[TABLE]
Therefore, the conditions in Theorem IV.1 are satisfied with probability at least when is as defined in equation (IV.6). In addition, after substituting the parameters and into (IV.4), and .
IV-B Proof of Theorem II.2 for Random Fourier Dictionary
In the proof of Theorem III.2, we have derived . Since the sets are disjoint, the indices of non-zero entries of for different are disjoint and . Moreover, has support from its definition in (III.8) which gives us
[TABLE]
because and following from Lemma 4.6 in [9] and equation (III.10) respectively. is constructed with and only rows indexed by are non-zero. Therefore, and with defined in equation (III.11). In addition, from Section III-F and Theorem II.1, we have , and with probability at least when
[TABLE]
Applying the same to Theorem IV.2, with probability at least when . One can easily examine that .
If we set and merge the requirements on , we obtain
[TABLE]
Thus, the conditions in Theorem IV.1 are satisfied with probability at least when satisfies (IV.7). Moreover, since , and with .
V Numerical Simulations
Here we present numerical simulations that illustrate and support our theoretical results. We set to be the first columns of the normalized DFT matrix . The parameters and are generated by sampling independently from the standard normal distribution, and the non-zero columns of the ground truth solution are selected uniformly. 40 simulations are run for each setting, based on which we compute the percentage of successful recovery. Both the dictionary, , and the ground truth solution, , including the support and its content, are sampled independently for each simulation. We solve problems (I.5) and (I.11) via CVX [31], and in the noiseless case if the relative error between the solution and the ground truth is smaller than , , we count it as a successful recovery.
V-A * The Sufficient Number of Measurement*
In the first noiseless simulation, we examine the recovery rate with respect to the parameters and . We fix and and let and range from to . The results are summarized in the phase transition plots of Fig. 1 for the random Gaussian dictionary and Fig. 2 for the random Fourier dictionary. The results for the two dictionaries are similar. The reciprocal nature of the phase transition boundary supports the linear scaling with in equations (II.1) and (II.2). Roughly when , the recovery success rate is satisfactory.
To further illustrate the linear scaling of the required number of measurements with respect to and , we fix and , and let and range from to and to , respectively. The results are recorded in Figs. 3 and 4 for the random Gaussian and Fourier dictionaries, respectively. The same simulation but switching the roles of and is also implemented, and the results are shown in Figs. 5 and 6. These results support the linear scaling of Theorem II.1.
V-B The Recovery Error Bound with Noisy Measurement
To test the noisy case, we set , , and , and we let with . Theorem II.2 gives a recovery guarantee of the form for a constant . Therefore, after dividing both sides by , setting and changing the units to decibels (dB), we obtain
[TABLE]
We call the relative error in dB and the noise-to-signal ratio in dB. To examine the linear relation between the relative error and the noise-to-signal ratio in equation (V.1), we sample the real and complex components of the noise vector independently from a standard Gaussian distribution and scale to attain different noise-to-signal ratios. Similar to the previous plots, 40 independent simulations are run for each noise-to-signal ratio and the range of the standard deviation and mean (computed before transforming to dB) of the relative error in dB are recorded in Figs. 7 and 8. The dashed lines show the theoretical error bound from Theorem II.2 which are drawn by substituting the constants derived in Section IV-A and IV-B and the system parameters into equations (II.3) and (II.5). The slope of each dashed line are 1. We observe that when noise-to-signal ratio is smaller than 0 dB, the relative error scales linearly with respect to the noise-to-signal ratio with slope 1 for both random Gaussian and Fourier dictionaries. This confirms that grows linearly with respect to in Theorem II.2. Moreover, if the noise dominates the observed signal, solving the problem (I.11) results in and the relative error becomes 0 dB.
V-C Direction of Arrival Estimation
In this section, we apply the proposed signal model to the direction of arrival estimation problem introduced in Section I-D1. Note that there exits thousands of different subspaces that the complex calibration could live in. To give a concrete example and compare to the related work, we adopt the setting from [9] where the calibration subspace is modeled by the first columns of the normalized DFT matrix . The entries of are sampled independently from the standard normal distribution and is normalized to have unit norm. Moreover, we set and discretize the direction of arrival into degrees. When the distance between array elements is half of the operating wavelength, we can obtain by substituting and into defined in Section I-D1. Furthermore, we set and where the directions of arrival of the 5 sources are degrees and the signal magnitudes are sampled independently from the uniform distribution on . The real and imaginary parts of the noise vector are independent random Gaussian vectors with [math] mean and identity covariance matrix. dB. By solving the norm minimization problem in (I.11), the index of the nonzero column in the solution indicates the direction of arrival and the norm of the nonzero column indicates the signal strength. The result is recorded in Fig .9 (a). As a comparison, we also apply the Sparselift method proposed in [9] to this problem, which assumes for all are the same and solves an norm minimization problem. The result is recorded in Fig .9 (b).
V-D Single Molecule Imaging
Furthermore, we apply the proposed signal model to the single molecule imaging described in Section I-D2. All data comes from the Single-Molecule Localization Microscopy grand challenge organized by ISBI [32] which contains 12,000 low-resolution frames. Each low-resolution frame is 64 pixel 64 pixel with pixel size 100 nm 100 nm, so that . A typical, observed frame is shown in Fig. 10 (a). Superimposing all the observed frames leads to the low-resolution structure in Fig. 10 (b). The target of this experiment is to recover the high resolution image of size 320 pixel 320 pixel, which implies that , whose pixel is of size 20 nm 20 nm. In addition, according to the statistic of the dataset, the number of activated fluorophores in each frame is less or equal to and we use the Gaussian point spread functions to approximate the point spread functions of the microscope. By implementing the SVD on the Gaussian point spread functions with different variances, we obtain a dimension subspace that point spread functions live in. Then by solving an norm regularized least square minimization problem on each low-resolution frame, we get totally 12,000 high resolution images and superimposing all the high resolution images results in the super-resolution output recorded in Fig. 10 (c).
VI Conclusion
In this paper, we introduce the generalized sparse recovery and blind demodulation model and achieve sparse recovery and blind demodulation simultaneously. Under the assumption that the modulating waveforms live in a known common subspace, we employ the lifting technique and recast this problem as the recovery of a column-wise sparse matrix from structured linear measurements. In this framework, we accomplish sparse recovery and blind demodulation simultaneously by minimizing the induced atomic norm, which in this problem corresponds to norm minimization. In the noiseless case, we derive near optimal sampling complexity that is proportional to the number of degrees of freedom, and in the noisy case we bound the recovery error of the structured matrix. Numerical simulations support our theoretical results. In addition to extending the class of dictionaries we have considered, an interesting future direction would be to relax the constraint that each is diagonal while preserving the low-dimensional subspace assumption.
Acknowledgment
This work was supported by NSF grant CCF.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] Y. Xie, M. B. Wakin, and G. Tang, “Sparse recovery and non-stationary blind demodulation,” in ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pp. 5566–5570, IEEE, 2019.
- 2[2] M. Lustig, D. L. Donoho, J. M. Santos, and J. M. Pauly, “Compressed sensing mri,” IEEE Signal Processing Magazine , vol. 25, no. 2, pp. 72–82, 2008.
- 3[3] F. J. Herrmann and G. Hennenfent, “Non-parametric seismic data recovery with curvelet frames,” Geophysical Journal International , vol. 173, no. 1, pp. 233–248, 2008.
- 4[4] S. Pudlewski and T. Melodia, “On the performance of compressive video streaming for wireless multimedia sensor networks,” in 2010 IEEE International Conference on Communications (ICC) , pp. 1–5, IEEE, 2010.
- 5[5] Y. Zhang, M. Roughan, W. Willinger, and L. Qiu, “Spatio-temporal compressive sensing and internet traffic matrices,” in ACM SIGCOMM Computer Communication Review , vol. 39, pp. 267–278, ACM, 2009.
- 6[6] A. Goldsmith, Wireless communications . Cambridge university press, 2005.
- 7[7] A. Ahmed, B. Recht, and J. Romberg, “Blind deconvolution using convex programming,” IEEE Transactions on Information Theory , vol. 60, no. 3, pp. 1711–1732, 2014.
- 8[8] D. Yang, G. Tang, and M. B. Wakin, “Super-resolution of complex exponentials from modulations with unknown waveforms,” IEEE Transactions on Information Theory , vol. 62, no. 10, pp. 5809–5830, 2016.
