Low-Complexity Blind Parameter Estimation in Wireless Systems with Noisy Sparse Signals
Alexandra Gallyas-Sanhueza, Christoph Studer

TL;DR
This paper introduces low-complexity blind estimators for noise power, signal power, SNR, and MSE in wireless systems with sparse signals, enabling improved parameter tracking and signal recovery without additional pilot overhead.
Contribution
It proposes novel blind estimators leveraging data sparsity, with theoretical analysis and practical applications in millimeter-wave and cell-free wireless systems.
Findings
Estimators accurately track system parameters in noisy, sparse data environments.
Application examples show improved channel estimation accuracy.
Estimators operate with low computational complexity.
| Estimator | Complexity: power estimation | Complexity: denoising | Accuracy: synthetic data | Accuracy: realistic channels |
|---|---|---|---|---|
| Baseline EM | | | (✓✓✓) | (✓✓) |
| Accelerated EM | | | (✓✓✓) | (✓✓) |
| Nonparametric | | | (✓) | (✓✓✓) |
| Parametric | | | (✓✓) | (✓✓✓) |
Low-Complexity Blind Parameter Estimation
in Wireless Systems with Noisy Sparse Signals
Alexandra Gallyas-Sanhueza and Christoph Studer
A. Gallyas-Sanhueza is with the School of Electrical and Computer Engineering, Cornell University, Ithaca, NY; email: [email protected]. C. Studer is with the Department of Information Technology and Electrical Engineering, ETH Zurich, Zurich, Switzerland; email: [email protected].
The work of AGS and CS was supported in part by ComSenTer, one of six centers in JUMP, a Semiconductor Research Corporation (SRC) program sponsored by DARPA, and in part by the US National Science Foundation (NSF) under grants CNS-1717559 and ECCS-1824379.
Part of this work was presented at the IEEE International Conference on Communications (ICC) 2021 [1]. This journal paper extends our work by (i) including a novel parametric noise power estimator with improved accuracy, (ii) evaluating the proposed blind estimators as an initializer for an expectation-maximization algorithm, and (iii) adding two application examples.
MATLAB code to reproduce our simulations is available on GitHub: https://github.com/IIP-Group/blind_and_nonparametric_estimators.
The authors thank Arian Maleki, Ramina Ghods, Charles Jeon, and Seyed Hadi Mirfarshbafan for discussions on signal recovery using SURE, and Haochuan Song for sharing the cell-free system simulator from [2].
Abstract
Baseband processing algorithms often require knowledge of the noise power, signal power, or signal-to-noise ratio (SNR). In practice, these parameters are typically unknown and must be estimated. Furthermore, the mean-square error (MSE) is a desirable metric to be minimized in a variety of estimation and signal recovery algorithms. However, the MSE cannot directly be used as it depends on the true signal that is generally unknown to the estimator. In this paper, we propose novel blind estimators for the average noise power, average receive signal power, SNR, and MSE. The proposed estimators can be computed at low complexity and solely rely on the large-dimensional and sparse nature of the processed data. Our estimators can be used (i) to quickly track some of the key system parameters while avoiding additional pilot overhead, (ii) to design low-complexity nonparametric algorithms that require such quantities, and (iii) to accelerate more sophisticated estimation or recovery algorithms. We conduct a theoretical analysis of the proposed estimators for a Bernoulli complex Gaussian (BCG) prior, and we demonstrate their efficacy via synthetic experiments. We also provide three application examples that deviate from the BCG prior in millimeter-wave multi-antenna and cell-free wireless systems for which we develop nonparametric denoising algorithms that improve channel-estimation accuracy with a performance comparable to denoisers that assume perfect knowledge of the system parameters.
I Introduction
Accurate knowledge of system parameters, such as the average noise power, average signal power, and/or signal-to-noise ratio (SNR), is critical in wireless communication systems, as many baseband processing tasks rely on these quantities [3]. Virtually all existing wireless systems dedicate training phases to estimate such parameters. These training phases typically consist of sending pilots: signals that are known to the receiver and enable estimation of the desired parameters. As pilots do not convey information, minimizing the pilot overhead is desirable in practice. Furthermore, parameter estimation in wireless systems operating at millimeter-wave (mmWave) frequencies must be done frequently, since the propagation conditions can change at fast rates, e.g., blockers or interferers may appear or disappear quickly [4]. Thus, it is even more important to reduce the pilot overhead. In addition, such systems are expected to support several GHz of bandwidth and basestations will consist of a large number of antenna elements. It is therefore important to develop low-complexity solutions that quickly and accurately track such parameters for high-dimensional problems that must be processed at fast rates.
From a parameter estimation perspective, it is beneficial that many modern wireless communication systems deal with high-dimensional data. For example, all-digital massive multiple-input multiple-output (MIMO) basestations are expected to be equipped with hundreds of antennas [5], and orthogonal frequency-division multiplexing (OFDM) systems will support thousands of subcarriers [6]. Since many of the high-dimensional signals arising in such systems exhibit structure (e.g., are sparse or are taken from a discrete set), one can design statistical methods that blindly estimate critical parameters without requiring a dedicated training phase.
In this paper, we focus on noisy observations of signal vectors that are sparse, i.e., only a few entries carry most of the signals’ energy. Examples of sparse vectors in wireless systems include (i) the beamspace-domain representation of all-digital mmWave multi-antenna channel vectors [7, 8, 9], (ii) the delay-domain representation of OFDM channel vectors [10], and (iii) the antenna-domain representation of channel vectors in cell-free MIMO wireless systems [11]. We will explain how sparsity can be exploited to estimate parameters and denoise noisy observations of sparse vectors. In Sections II, III, and IV, we decouple our results from wireless communication applications and study the general setting. In Section V, we apply our estimators and algorithms to three distinct applications in wireless systems.
In what follows, we will use the term “blind” for estimators that do not use any pilots or training sequences and instead rely only on the signal statistics; blind estimators may have tuning parameters. We will use the term “nonparametric” for estimators that do not need knowledge of system parameters and do not have parameters that need to be tuned manually; nonparametric estimators may use pilots or training sequences.
I-A Prior Art in Blind and Nonparametric Estimation
Many of the existing blind noise power and SNR estimators exploit modulation-specific structure, such as the cyclic prefix redundancy in OFDM [12, 13], or the periodicity of synchronization sequences [14]. Expectation-maximization (EM) has also been used for blind noise power or SNR estimation [15], and for joint sparse signal recovery and noise power estimation [16, 17]. However, the iterative nature of Bayesian algorithms and EM, together with their relatively high per-iteration complexity, renders such methods unsuitable for real-time estimation in wireless systems that operate with high-dimensional data at gigabit-per-second sampling rates. In contrast, we propose low-complexity blind estimators whose complexity only scales with $O(D)$, where $D$ is the dimension of the processed data. Our proposed low-complexity estimators can also be used as an initialization point to accelerate the convergence of existing EM algorithms.
Joint noise power estimation and sparse signal recovery was investigated in [18]; these methods require the choice of algorithm parameters, which affect the estimation accuracy and robustness. A parameter-free version of sparse signal recovery that combines approximate message passing (AMP) [19, 20] with Stein’s unbiased risk estimate (SURE) [21, 22] was proposed in [23]. Similarly, the nonparametric equalizer (NOPE) [24] combines AMP with SURE to perform linear minimum mean-square error (MSE) equalization in massive MIMO systems without knowledge of the SNR. A drawback of such algorithms is the high per-iteration complexity, which prevents their use in wireless systems supporting large bandwidths and high-dimensional problems (see, e.g., [25, 26] for hardware results of sparse signal recovery). We therefore focus on low complexity, blind, and nonparametric algorithms for the fully-determined setting (in contrast to compressive sensing where one has fewer measurements than unknowns), which finds use in many practical situations. For example, all-digital massive MIMO architectures (which can be as energy efficient as hybrid analog-digital architectures [27, 28, 29]) and cell-free wireless systems can provide measurement vectors of the same dimension as the sparse signal. In OFDM systems, even though pilots are typically transmitted only on a subset of all subcarriers, interpolation and extrapolation algorithms can be used to extract channel state information on all subcarriers [30]; this also leads to the fully-determined setting that enables the use of our methods.
In this low-complexity setting, the concept of estimating tuning parameters directly from the noisy observations has been used recently for adaptive denoising of mmWave [7, 8, 9] or OFDM [10] channel vectors. Such denoising algorithms typically require a tuning parameter: the denoising threshold. While SURE can be used to automatically determine the MSE-optimal denoising threshold, it still requires knowledge of the noise power. In contrast to such results, we propose low-complexity blind estimators, which enable the design of nonparametric (i.e., parameter free) channel-vector denoising algorithms that deliver comparable performance to methods that assume perfect knowledge of the required parameters (e.g., the noise power).
Blind nonparametric algorithms have been proposed for denoising of real-valued signals. The authors in [31] have used power estimation methods based on the median absolute deviation (MAD) of real-valued signals for wavelet denoising. The Python wavelet toolbox PyYAWT [32] includes MAD-based power estimation and adaptive wavelet denoising using SURE for real-valued signals. Our methods also build upon MAD and SURE, but are suitable for complex-valued signals. In addition, we provide a detailed derivation and a theoretical analysis, and extend the general concept to estimate other quantities that frequently arise in wireless systems. While some papers apply real-valued MAD for noise power estimation in the complex-valued setting (see, e.g., [33] for magnetic resonance imaging), there are non-negligible differences to the complex case. We therefore derive the complex-valued version, provide a theoretical accuracy analysis with a Bernoulli complex Gaussian (BCG) prior, and show application examples that deviate from this prior in order to highlight robustness and usefulness of our results.
I-B Contributions
A variety of applications in communication systems deal with sparse and complex-valued signals whose observations are contaminated with noise. For such a model, we propose novel low-complexity blind estimators for the average noise power, average signal power, and SNR. In addition, we propose a blind estimator for the MSE of an estimation function that aims to recover the sparse signal. We use this blind MSE estimate to design a novel nonparametric channel-vector denoising algorithm. We conduct a theoretical analysis of our estimators for a BCG prior, and we showcase simulation results with synthetic data in order to demonstrate the efficacy and limits of our estimators in finite dimensions. In order to demonstrate the efficacy of our results in situations that deviate from a BCG prior, we provide three application examples of channel-vector denoising in mmWave and cell-free communication systems. We also show that our low-complexity estimators can be used to accelerate the convergence (and, hence, reduce the complexity) of existing estimators with a concrete example of an EM-based algorithm.
I-C Notation
Lowercase and uppercase boldface letters denote column vectors and matrices, respectively. The th entry of the vector is ; the real and imaginary parts are and , respectively. We use to refer to for . For , the vector -norm is defined as for with and the -pseudo-norm counts the number of nonzero entries in . The identity matrix is and the all-zeros vector is . The discrete Fourier transform matrix is denoted by and satisfies , where the superscript denotes the Hermitian (conjugate transpose) matrix. An i.i.d. circularly-symmetric complex Gaussian random vector with variance per complex dimension is denoted by and its probability density function (PDF) evaluated at is . Sample estimates are denoted by a bar, e.g., the sample variance of the random vector ; statistical quantities are denoted by plain symbols, e.g., the variance , where denotes expectation; blind estimators are denoted by a hat, e.g., $\widehat{N}_0$. For , rounding towards plus and minus infinity is denoted by and , respectively, and . Convergence in probability of a random sequence $\{A_n\}$ to a random variable $A$ is denoted by $A_n \xrightarrow[n\to\infty]{\text{prob.}} A$, and almost sure convergence by $A_n \xrightarrow[n\to\infty]{\text{a.s.}} A$.
II Practical Guide to Low-Complexity Blind Estimators
We now introduce two system models and propose low-complexity blind estimators for the average noise and signal powers, SNR, and MSE. The derivation of the proposed estimators and an analysis of the key properties are provided in Section III.
II-A System Models
We say that a complex-valued vector $\mathbf{s}\in\mathbb{C}^D$ is sparse if the number of nonzero entries is smaller than the dimension $D$. As a sparsity measure, one can use, for example, the $\ell_0$-pseudo-norm $\|\mathbf{s}\|_0$. This definition of sparsity allows us to derive theoretical results, but in practice, our algorithms also work for approximately sparse signals in which most entries are small compared to the noise (but not necessarily zero). We will focus on the following two system models.
System Model 1.
Let $\mathbf{s}\in\mathbb{C}^D$ be a sparse signal with average power $E_s$. We model the input-output relation of a noisy observation of the sparse signal as
$$\mathbf{y} = \mathbf{s} + \mathbf{n},$$
where $\mathbf{y}\in\mathbb{C}^D$ is the noisy observation and $\mathbf{n}\in\mathbb{C}^D$ models noise with i.i.d. circularly-symmetric complex Gaussian entries of variance $N_0$. We assume that the sparse signal vector and noise vector are statistically independent.
System Model 1 finds numerous applications in wireless communication systems. Prime examples are in describing estimated channel vectors (i) in multi-antenna mmWave systems, where the beamspace-domain representation of the channel vectors is typically sparse [7, 8, 9], (ii) in OFDM systems, where the delay-domain representation of the channel vectors is typically sparse [10], or (iii) in cell-free communication systems with centralized processing, where the antenna-domain representation of the channel vectors is typically sparse [11]. In what follows, we assume the sparse vector $\mathbf{s}$ is unknown (in contrast to pilot-based estimation), which makes parameter estimation nontrivial in this blind scenario.
System Model 2.
Let $\mathbf{y}$ be a noisy observation as in System Model 1. Fix a weakly differentiable function $f$ that operates entry-wise on vectors. (A weakly differentiable function may be nondifferentiable only on zero-measure sets, e.g., for particular values, and has to be differentiable everywhere else.) We model the output after applying this function to the noisy observation as
$$f(\mathbf{y}) = \mathbf{s} + \mathbf{e},$$
where $\mathbf{e}$ contains (likely non-Gaussian) residual distortion. We emphasize that the sparse signal vector $\mathbf{s}$ and the residual distortion vector $\mathbf{e}$ are not necessarily statistically independent.
System Model 2 is relevant in the following scenarios: (i) Estimating a sparse signal from a noisy observation by applying an entry-wise denoising or estimation function, producing the signal estimate $f(\mathbf{y})$; this scenario finds use for channel-vector denoising [7, 8, 9]. (ii) Modeling nonlinearities caused by hardware impairments [34], in which case the distorted version of the noisy received signal can be expressed as $f(\mathbf{y})$; this scenario finds use in signals sampled with low-resolution data converters [35, 36], for example.
II-B Low-Complexity Blind Nonparametric Estimators
In what follows, we make use of the sample median, which we define as follows.
Definition 1 (Sample Median).
Let $\mathbf{a}\in\mathbb{R}^D$ be a real-valued vector and $\mathbf{a}^{\mathrm{s}}$ be its sorted version (entries sorted in ascending order). Then, the sample median is defined as
$$\mathrm{median}(\mathbf{a}) = \tfrac{1}{2}\Big(a^{\mathrm{s}}_{\lfloor (D+1)/2\rfloor} + a^{\mathrm{s}}_{\lceil (D+1)/2\rceil}\Big).$$
The sample median is robust to outliers [37, 38], which makes it well suited to System Model 1, as the nonzero entries of the sparse vector can be considered to be outliers for the purpose of separating the sparse signal from noise. We emphasize that the sample median can be computed in $O(D)$ average time using quickselect [39] or in $O(D)$ deterministic time using the MedianOfNinthers algorithm [40].
We now propose a range of low-complexity blind estimators (no pilots required) for complex-valued signals that require no parameters.
Estimator 1 (Average Noise Power).
Consider System Model 1. We propose the following blind estimator
$$\widehat{N}_0 = \frac{\mathrm{median}(|\mathbf{y}|^2)}{\log(2)}$$
for the average noise power defined as $N_0 = \frac{1}{D}\mathbb{E}\big[\|\mathbf{n}\|_2^2\big]$, where $|\mathbf{y}|^2$ denotes the vector of entry-wise absolute squares of the noisy observation $\mathbf{y}$.
Estimator 1 is blind as it only requires the absolute square entries of the noisy observation $\mathbf{y}$ in System Model 1. The estimate $\widehat{N}_0$ can be computed efficiently in $O(D)$ time, since the most complex operation is computing the median of a vector of dimension $D$. Estimator 1 exploits sparsity in the signal $\mathbf{s}$, but its definition does not depend on the signal sparsity, the signal power, or the statistical sparsity model. It is, however, important to understand that the accuracy of this estimator depends on all of these factors as it relies on the fact that the nonzero entries of the sparse vector can be treated as outliers for the purpose of estimating the average noise power. We note that this noise power estimator can be seen as a complex-valued and squared version of the median absolute deviation (MAD) estimator [37, 41], where we use the assumption that the noise in System Model 1 is zero mean. (The squared MAD estimator for real-valued signals provided in [38] corresponds to $\mathrm{median}(|\mathbf{y}|)^2$, whereas we propose to use $\mathrm{median}(|\mathbf{y}|^2)$. While the two differ in general if $D$ is even, both estimators coincide if $D$ is odd. What is more, our scaling factor $1/\log(2)$ differs considerably from the widely used scaling factor for real-valued signals [31]. We reiterate that the latter is derived for power estimation of real-valued Gaussians using the MAD estimator, while in our derivation we consider the case of complex-valued Gaussians.) The intuition behind this estimator (and the $1/\log(2)$ factor) is the fact that the entries $|n_k|^2$, $k=1,\ldots,D$, are (scaled) chi-squared distributed with two degrees of freedom, which have a median of $N_0\log(2)$, and that the median of $|\mathbf{y}|^2$ is not significantly “contaminated” by the sparse signal. Estimator 1 is used in the estimators proposed next.
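The median-based noise power estimate can be tried out on synthetic data. The following is a minimal sketch using only the Python standard library, not the authors' MATLAB code; the helper names `complex_gaussian` and `noisy_sparse_vector`, the seed, and all parameter values are assumptions chosen for illustration.

```python
import math
import random
import statistics


def estimate_noise_power(y):
    """Blind noise power estimate: median of |y_k|^2 divided by log(2)."""
    z = [abs(v) ** 2 for v in y]
    # statistics.median sorts its input (O(D log D)); a selection algorithm
    # such as quickselect would reach the linear average cost cited in the text.
    return statistics.median(z) / math.log(2)


def complex_gaussian(rng, variance):
    """Draw one circularly-symmetric complex Gaussian sample (hypothetical helper)."""
    sigma = math.sqrt(variance / 2)  # variance split evenly over real and imaginary parts
    return complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))


def noisy_sparse_vector(rng, dim, activity_rate, nonzero_var, noise_power):
    """Synthetic y = s + n where each s_k is nonzero with probability activity_rate."""
    y = []
    for _ in range(dim):
        s = complex_gaussian(rng, nonzero_var) if rng.random() < activity_rate else 0j
        y.append(s + complex_gaussian(rng, noise_power))
    return y


rng = random.Random(0)  # fixed seed, chosen arbitrarily
y = noisy_sparse_vector(rng, dim=4096, activity_rate=0.05, nonzero_var=100.0, noise_power=1.0)
n0_hat = estimate_noise_power(y)
print(f"true N0 = 1.0, blind estimate = {n0_hat:.3f}")
```

In this setting the estimate should land slightly above the true noise power, because the few strong signal entries push the median of $|\mathbf{y}|^2$ up a little.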
Estimator 2 (Average Signal Power).
Consider System Model 1. We propose the following blind estimator
$$\widehat{E}_s = \bar{E}_y - \widehat{N}_0, \quad \text{with} \quad \bar{E}_y = \tfrac{1}{D}\|\mathbf{y}\|_2^2,$$
for the average signal power defined as $E_s = \frac{1}{D}\mathbb{E}\big[\|\mathbf{s}\|_2^2\big]$.
Estimator 2 is blind as it only requires the sample estimate $\bar{E}_y$ of the receive power and the blind noise estimate $\widehat{N}_0$ from Estimator 1. $\widehat{E}_s$ can be computed efficiently in $O(D)$ time, since the most complex operation is computing $\widehat{N}_0$. The intuition behind this estimator comes from subtracting the estimated noise power from the total receive power, as done previously in [13] for an OFDM-specific estimator.
Estimator 3 (Signal-to-Noise Ratio).
Consider System Model 1. We propose the following blind estimator
$$\widehat{\mathrm{SNR}} = \frac{\bar{E}_y - \widehat{N}_0}{\widehat{N}_0}$$
for the SNR defined as $\mathrm{SNR} = E_s/N_0$, where $\bar{E}_y = \tfrac{1}{D}\|\mathbf{y}\|_2^2$ is the sample estimate of the receive power.
Estimator 3 is blind as it only requires the sample estimate $\bar{E}_y$ of the receive power and the blind estimate $\widehat{N}_0$ from Estimator 1. $\widehat{\mathrm{SNR}}$ can also be computed efficiently in $O(D)$ time. The intuition behind this estimator comes from dividing the estimated signal and noise powers, as done previously in [13] for an OFDM-specific estimator.
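The signal power and SNR estimates follow from the noise power estimate by a subtraction and a division. The sketch below illustrates this on synthetic data under assumed parameters (activity rate 0.05, nonzero-entry variance 100, noise power 1, so the true signal power and SNR are both 5); the function and variable names are hypothetical, and no clamping of negative signal power estimates is applied.

```python
import math
import random
import statistics


def blind_power_estimates(y):
    """Return (noise power, signal power, SNR) estimated from y alone."""
    z = [abs(v) ** 2 for v in y]
    n0_hat = statistics.median(z) / math.log(2)  # median-based noise power
    ey_bar = sum(z) / len(z)                     # sample estimate of the receive power
    es_hat = ey_bar - n0_hat                     # receive power minus noise power
    snr_hat = es_hat / n0_hat                    # ratio of the two power estimates
    return n0_hat, es_hat, snr_hat


def complex_gaussian(rng, variance):
    """Draw one circularly-symmetric complex Gaussian sample (hypothetical helper)."""
    sigma = math.sqrt(variance / 2)
    return complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))


rng = random.Random(1)  # fixed seed, chosen arbitrarily
dim, p, nonzero_var, n0 = 16384, 0.05, 100.0, 1.0
y = []
for _ in range(dim):
    s = complex_gaussian(rng, nonzero_var) if rng.random() < p else 0j
    y.append(s + complex_gaussian(rng, n0))

n0_hat, es_hat, snr_hat = blind_power_estimates(y)
print(f"N0: {n0_hat:.2f} (true 1.0), Es: {es_hat:.2f} (true 5.0), SNR: {snr_hat:.2f} (true 5.0)")
```

Because the noise power estimate tends to come out slightly high in this setting, the signal power and SNR estimates tend to come out slightly low.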
Estimator 4 (Mean-Square Error).
Consider System Model 2 with a fixed function $f$. We propose the following blind estimator
[TABLE]
for the MSE defined as $\mathrm{MSE} = \frac{1}{D}\mathbb{E}\big[\|f(\mathbf{y}) - \mathbf{s}\|_2^2\big]$.
Estimator 4 is blind as it only requires the receive signal $\mathbf{y}$, the blind estimate $\widehat{N}_0$ from Estimator 1, and the function $f$. The complexity of the proposed MSE estimator depends on the function $f$. For example, the estimate can be computed efficiently in $O(D)$ time for the soft-thresholding function with a given threshold. Even if the threshold is not given, searching for the best threshold and applying the soft-thresholding function can be done in $O(D\log D)$ time using the methods developed for the BEACHES algorithm in [7]. The MSE is a frequently used metric to evaluate the performance of estimation algorithms. Our blind MSE estimate, since it is independent of $\mathbf{s}$, can be used to automatically tune parameters in estimators. The intuition behind this estimator relies on SURE, and we refer the interested reader to [22] for an accessible derivation in the real-valued case and to [7, 8] for a derivation in the complex-valued case. Estimator 4 is used to obtain the nonparametric channel-vector denoising algorithm described in Section V.
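For the soft-thresholding function, a SURE-type MSE estimate has a simple closed form per entry. The sketch below uses the form obtained by applying SURE to complex soft-thresholding with the true noise power replaced by the median-based estimate: an entry with $|y_k| \le \tau$ contributes $|y_k|^2 - \widehat{N}_0$, and an entry with $|y_k| > \tau$ contributes $\tau^2 + \widehat{N}_0 - \widehat{N}_0\tau/|y_k|$. This is an illustration under those assumptions rather than a transcription of Estimator 4, and the threshold search is a plain grid search, not the $O(D\log D)$ BEACHES procedure; all names and parameter values are hypothetical.

```python
import math
import random
import statistics


def soft_threshold(v, tau):
    """Complex soft-thresholding: shrink the magnitude by tau and keep the phase."""
    mag = abs(v)
    return v * (1 - tau / mag) if mag > tau else 0j


def blind_sure(y, tau, n0_hat):
    """SURE-type MSE estimate for soft-thresholding with an estimated noise power."""
    total = 0.0
    for v in y:
        mag = abs(v)
        if mag > tau:
            total += tau ** 2 + n0_hat - n0_hat * tau / mag
        else:
            total += mag ** 2 - n0_hat
    return total / len(y)


def complex_gaussian(rng, variance):
    """Draw one circularly-symmetric complex Gaussian sample (hypothetical helper)."""
    sigma = math.sqrt(variance / 2)
    return complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))


rng = random.Random(2)  # fixed seed, chosen arbitrarily
dim, p, nonzero_var, n0 = 8192, 0.05, 100.0, 1.0
s = [complex_gaussian(rng, nonzero_var) if rng.random() < p else 0j for _ in range(dim)]
y = [sk + complex_gaussian(rng, n0) for sk in s]

n0_hat = statistics.median(abs(v) ** 2 for v in y) / math.log(2)
candidates = [0.1 * i for i in range(1, 41)]  # grid of thresholds (assumption)
tau_best = min(candidates, key=lambda tau: blind_sure(y, tau, n0_hat))
sure_best = blind_sure(y, tau_best, n0_hat)
# The true MSE needs s and is computed here only to check the blind estimate.
mse_true = sum(abs(soft_threshold(yk, tau_best) - sk) ** 2 for yk, sk in zip(y, s)) / dim
print(f"tau = {tau_best:.1f}, blind MSE estimate = {sure_best:.3f}, true MSE = {mse_true:.3f}")
```

The threshold is selected without access to the sparse signal or the true noise power, which is what makes a denoiser built this way nonparametric.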
II-C Low-Complexity Blind Parametric Estimators
We now propose a low-complexity blind estimator (no pilots or training sequences required) that takes an estimate of the activity rate as a parameter. We then propose a family of parametric estimators for the activity rate.
Estimator 5 (Average Noise Power with Estimated SNR and Activity Rate Corrections).
Consider System Model 1, the low-complexity blind estimates $\widehat{N}_0$ and $\widehat{\mathrm{SNR}}$ (see Estimator 3), and a parameter $\hat{p}$ that is an estimate of the activity rate $p$. We propose the following blind parametric estimator
[TABLE]
for the average noise power $N_0$.
Estimator 5 is blind as it only requires the blind estimates $\widehat{N}_0$ and $\widehat{\mathrm{SNR}}$, but is parametric as it depends on the activity rate estimate $\hat{p}$. It can be computed efficiently in $O(D)$ time, since the most complex operations are computing $\widehat{N}_0$ and $\widehat{\mathrm{SNR}}$, and eventually $\hat{p}$ (but here we consider $\hat{p}$ as a given parameter and ignore the complexity associated with obtaining it). The intuition behind this estimator will become clear after we present Theorem 1, as it is derived from averaging a lower and an upper bound on $N_0$. As shown in Section IV, this parametric estimate often yields better accuracy than the nonparametric estimate $\widehat{N}_0$.
Since in some applications an estimate for $p$ may be unavailable, we next propose a family of estimators that attempt to extract the activity rate directly from the noisy observation vector $\mathbf{y}$. Such estimators can, for example, be used to substitute $\hat{p}$ in Estimator 5.
Estimator 6 (Activity Rate).
Consider System Model 1 and integers $q$ and $r$. We propose the following family of blind parametric estimators $\hat{p}(q,r)$ (in practice, a modified value is used in place of $\hat{p}(q,r)$, so that $\log\!\big(\frac{2-2\hat{p}(q,r)}{1-2\hat{p}(q,r)}\big)$ is always well-defined, as required by the activity-rate condition of Theorem 1)
[TABLE]
for the activity rate $p$, defined as $p = \|\mathbf{s}\|_0/D$. (We define the activity rate as the fraction of nonzero entries of the vector $\mathbf{s}$. Values of $p$ close to $0$ indicate the vector is sparse and $p=1$ indicates that all entries are nonzero.)
Estimator 6 is blind as it only requires the receive vector $\mathbf{y}$, but is parametric as it requires a choice for $q$ and $r$. $\hat{p}(q,r)$ can be computed efficiently and, among others, the following choices for $q$ and $r$ require low complexity: , , , and . The parameters $q$ and $r$ must be chosen according to simulations, as we are unaware of a principled and reliable way to determine them. In our simulations, the choice performed best.
II-D Blind Parametric Estimator Based on Expectation-Maximization (EM)
As a baseline, we also consider a blind EM estimator (no pilots required) that requires initialization values and algorithm parameters that determine the convergence criterion.
Estimator 7 (Noise Power, Signal Power, and Activity Rate).
Consider System Model 1. Algorithm 1, initialized with values for the noise power and the activity rate, simultaneously estimates the noise power $N_0$, the signal power $E_s$, and the activity rate $p$.
Estimator 7 is blind as it only requires the noisy observation $\mathbf{y}$, but is parametric as it needs a choice for the maximum number of iterations, the tolerance, and initialization values for the noise power and activity rate. The total number of EM iterations is not fixed but depends on the maximum number of iterations, the tolerance, the two initialization values, and on the input itself. The complexity of Estimator 7 grows with the product of the number of EM iterations and the dimension $D$. We note that this estimator is a variant of a classical EM algorithm for a two-component Gaussian mixture [42], where we use the assumption that the signal and the noise in System Model 1 are zero mean and complex valued. The intuition behind this estimator is the fact that each entry $y_k$, $k=1,\ldots,D$, of the vector $\mathbf{y}$ contains either noise or signal-plus-noise, and those two cases have Gaussian distributions with different variances.
We note that this baseline EM algorithm is only a minor variation of the method in [42, Alg. 8.1]. The iterative nature of such methods, however, results in (often significantly) higher complexity than our estimators. With this in mind, we propose an improved version that we call “accelerated EM,” which simply consists of initializing the baseline EM algorithm using our blind nonparametric noise variance estimator. As we will see when studying accelerated convergence, this accelerated EM variant drastically reduces the number of iterations needed for convergence without degrading accuracy.
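The idea of initializing EM with the median-based noise power estimate can be sketched as follows. This is a generic EM iteration for a zero-mean two-component complex Gaussian mixture written for illustration; it is not the paper's algorithm listing, and the stopping rule, the initial activity rate of 0.1, the way the initial signal-plus-noise variance is derived from the sample power, and all names are assumptions.

```python
import math
import random
import statistics


def em_power_estimates(y, n0_init, p_init, max_iter=200, tol=1e-6):
    """EM for a two-component zero-mean complex Gaussian mixture.

    The "idle" component has variance n0 (noise only); the "active" component
    has variance var_active (signal plus noise).
    Returns (noise power, signal power, activity rate, iterations used).
    """
    z = [abs(v) ** 2 for v in y]
    dim = len(z)
    mean_power = sum(z) / dim
    n0, p = n0_init, p_init
    # Active-component variance implied by the sample power, kept above n0.
    var_active = max((mean_power - (1 - p) * n0) / p, 2 * n0)
    for iteration in range(1, max_iter + 1):
        # E-step: posterior probability that each entry carries signal.
        gamma = []
        for zk in z:
            active = p / var_active * math.exp(-zk / var_active)
            idle = (1 - p) / n0 * math.exp(-zk / n0)
            gamma.append(active / (active + idle + 1e-300))
        # M-step: responsibility-weighted power averages and mixing weight.
        g_sum = sum(gamma)
        n0_new = sum((1 - g) * zk for g, zk in zip(gamma, z)) / (dim - g_sum)
        var_active = sum(g * zk for g, zk in zip(gamma, z)) / g_sum
        p = g_sum / dim
        done = abs(n0_new - n0) <= tol * n0
        n0 = n0_new
        if done:
            break
    return n0, p * (var_active - n0), p, iteration


def complex_gaussian(rng, variance):
    """Draw one circularly-symmetric complex Gaussian sample (hypothetical helper)."""
    sigma = math.sqrt(variance / 2)
    return complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))


rng = random.Random(3)  # fixed seed, chosen arbitrarily
dim, p_true, nonzero_var, n0_true = 16384, 0.05, 100.0, 1.0
y = []
for _ in range(dim):
    s = complex_gaussian(rng, nonzero_var) if rng.random() < p_true else 0j
    y.append(s + complex_gaussian(rng, n0_true))

# "Accelerated" start: blind median-based noise power estimate as initial value.
n0_start = statistics.median(abs(v) ** 2 for v in y) / math.log(2)
n0_em, es_em, p_em, iters = em_power_estimates(y, n0_init=n0_start, p_init=0.1)
print(f"N0 = {n0_em:.3f}, Es = {es_em:.2f}, p = {p_em:.3f} after {iters} iterations")
```

Each iteration touches all $D$ entries once, so the total cost grows with the number of iterations, which is the quantity a good starting point is meant to reduce.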
II-E Summary of Proposed Power Estimation and Denoising Algorithms
Table I summarizes the complexity and accuracy of the different estimators. “Baseline EM” refers to Estimator 7, “accelerated EM” to Estimator 7 initialized using Estimator 1, “nonparametric” to Estimator 1, and “parametric” to Estimator 5. The complexity for blind noise power estimation is mentioned below the definition of each of these algorithms. The complexity for denoising is the complexity of estimating the noise power plus the complexity of the BEACHES algorithm from [8]. Since BEACHES already sorts the magnitudes of the noisy signal, the nonparametric and parametric estimators that use the median require no additional complexity for estimating the noise power. Anticipating the results shown in Section IV and Section V, we illustrate (qualitatively) the accuracy of the estimators with synthetic data that perfectly matches the BCG prior, and with practical examples that deviate from this prior.
III Theory
We first show that the sample median approaches the median for $D\to\infty$ and introduce our statistical model for sparse vectors. We then derive and analyze Estimators 1 to 7. The observations made in this section are valid in the large-dimension limit and for the noisy BCG model to be introduced in Definition 4. We use simulations to demonstrate the accuracy of our estimators for finite (and small) dimensions with the noisy BCG model in Section IV. To demonstrate the efficacy of our methods in practical scenarios with signals that deviate from the BCG model, we evaluate our denoising algorithms in three distinct scenarios in Section V.
III-A Convergence of the Sample Median for $D\to\infty$
We will use the following definition of the median.
Definition 2 (Median).
Let $X$ be an absolutely continuous random variable (RV) with cumulative distribution function (CDF) $F_X$. Then, the median of $X$ is defined as the value $\mathsf{m}_X$ that satisfies
$$F_X(\mathsf{m}_X) = \tfrac{1}{2}.$$
While, analogously to the central limit theorem, the sample median is approximately Gaussian if $D$ is large (see, e.g., [43]), we will only use the following result.
Lemma 1** (Lem. C.1 from [43]).**
Let be a RV whose PDF is differentiable in some neighborhood of the median and vector contain i.i.d. samples of . Then, for any the sample median satisfies
[TABLE]
This result implies that in the large-dimension limit ($D\to\infty$), the sample median converges in probability to the median $\mathsf{m}_X$. Hence, by observing a sufficiently large number of samples, which is possible in modern multi-antenna mmWave or OFDM systems, we can accurately estimate the median $\mathsf{m}_X$.
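The convergence of the sample median can be observed numerically. The sketch below draws exponential samples, which is the distribution of $|n_k|^2$ for circularly-symmetric complex Gaussian noise, and compares the sample median with the true median $N_0\log(2)$ for growing dimension; the seed and the dimensions are arbitrary choices made for illustration.

```python
import math
import random
import statistics


def median_error(rng, dim, noise_power=1.0):
    """|sample median - true median| for exponential samples with mean noise_power."""
    z = [rng.expovariate(1.0 / noise_power) for _ in range(dim)]
    return abs(statistics.median(z) - noise_power * math.log(2))


rng = random.Random(4)  # fixed seed, chosen arbitrarily
for dim in (10, 100, 1000, 10000, 100000):
    print(f"D = {dim:6d}: |sample median - median| = {median_error(rng, dim):.4f}")
```

The error is random for each draw, so it need not shrink monotonically from one line to the next, but it should become small for the largest dimensions.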
III-B Statistical Model for Complex-Valued Sparse Vectors
To derive and analyze the blind estimators proposed in Section II-B, we need a statistical model for the sparse signal $\mathbf{s}$. This model should (i) have as few parameters as possible while being able to model a large class of complex-valued sparse vectors typically arising in communication systems and (ii) facilitate a theoretical analysis. In what follows, we consider BCG random vectors [44, 20], which allow control over the signal sparsity and the signal power. We reiterate that the BCG model is instrumental only for our analysis. The provided simulation results in Section V will show that the proposed estimators exhibit robustness to model mismatch, e.g., for signals that are not necessarily i.i.d. Gaussian or circularly symmetric.
Definition 3** (BCG Random Vector).**
A sparse vector is BCG if each entry is nonzero with probability , and the nonzero entries are i.i.d. circularly-symmetric complex Gaussian with variance . The PDF of each entry , , is therefore given by
[TABLE]
where is the Dirac delta distribution.
With this model, the activity rate is (meaning the expected number of nonzero entries is ), and the average power of the sparse signal vector is .
In \frefsys:systemmodel1, we assumed that the noise vector is i.i.d. circularly-symmetric complex Gaussian with variance per complex entry. Hence, the PDF of each entry , , is given by . Consequently, if is a BCG random vector, then the PDF of the noisy observation vector is as follows.
**Definition 4 (Noisy BCG Random Vector).**
The PDF of the entries , , of a BCG random vector per \frefdef:BCG observed as in \frefsys:systemmodel1 is given by
[TABLE]
For this signal and observation model, we are now able to derive and analyze Estimators 1 to 7. We will make frequent use of the entry-wise absolute square of the noisy observation vector $\mathbf{y}$, which we call $\mathbf{z}$. We also define a random variable (RV) $Z$ with the same distribution as any of the i.i.d. entries of $\mathbf{z}$, and let $\mathsf{m}_Z$ be the median of $Z$.
III-C Analysis of \frefest:noisevariance
We start with the blind noise power estimator defined in \frefest:noisevariance. We have the following key result. The proof is given in \frefapp:mainresult.
**Theorem 1.**
Let be a noisy BCG random vector with PDF as in \frefdef:noisyBCG and with activity rate satisfying
[TABLE]
Let a lower bound LB and an upper bound UB be defined as follows:
[TABLE]
Then, the average noise power $N_0$ satisfies the following bounds. (Footnote 5: Here we simplify the notation: $\widehat{N}_0$ converges in probability to $\mathsf{m}_Z/\log(2)$, and strictly speaking this latter expression is the upper bound.)
[TABLE]
\frefthm:mainresult has the following key implications: (i) In the large-dimension limit, the proposed blind estimate $\widehat{N}_0$ bounds the average noise power $N_0$ from above, i.e., we have developed a pessimistic estimator. (ii) If $p\to0$ or $\mathit{SNR}\to0$, then the bounds in \frefeq:yummysandwichbound coincide. Thus, for either $p\to0$ or $\mathit{SNR}\to0$, the proposed estimate is exact, i.e., $\widehat{N}_0 \xrightarrow[D\to\infty]{\text{prob.}} \mathsf{m}_Z/\log(2) = N_0$. We summarize this important insight in the following remark.
**Remark 1.**
In the large-dimension limit ($D\to\infty$), the proposed blind nonparametric estimate $\widehat{N}_0$ is pessimistic (i.e., overestimates the average noise power $N_0$), and becomes exact at low SNR or low activity rate (i.e., for sparse vectors).
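A minimal Python sketch of the median-based estimate, using the limit $\mathsf{m}_Z/\log(2)$ stated above: the noise power is estimated as the sample median of the entry-wise absolute squares divided by $\log(2)$. The function name `estimate_noise_power`, the test parameters, and the assumed variance `Es / p` of the active entries are illustrative choices, not taken from the paper.

```python
import numpy as np

def estimate_noise_power(y):
    """Blind noise power estimate: median(|y|^2) / log(2)."""
    return np.median(np.abs(y) ** 2) / np.log(2.0)

# Illustrative noisy sparse vector (all parameters are assumptions).
rng = np.random.default_rng(2)
D, p, Es, N0 = 4096, 0.05, 1.0, 1.0
active = rng.random(D) < p
s = np.where(active,
             np.sqrt(Es / p / 2.0) * (rng.standard_normal(D) + 1j * rng.standard_normal(D)),
             0.0)
n = np.sqrt(N0 / 2.0) * (rng.standard_normal(D) + 1j * rng.standard_normal(D))
N0_hat = estimate_noise_power(s + n)
```

For a sparse signal such as this one, the estimate is expected to sit slightly above the true noise power, in line with the pessimistic behavior described in the remark.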
Next, we present bounds on the relative error of \frefest:noisevariance. These bounds depend on the activity rate and the SNR. The proof is given in \frefapp:errorproof.
**Corollary 1.**
For $p$ as in \frefeq:probabilitycondition, the relative error of \frefest:noisevariance in the large-dimension limit is bounded as follows:
[TABLE]
An upper bound for the relative error can be obtained if (i) an upper bound on the SNR is known, or (ii) an upper bound on $p$ is known, since the bound is nondecreasing in $p$ over the admissible range. In addition, we confirm the second implication discussed below \frefthm:mainresult: \frefcor:errorbound implies that if $p\to0$ (irrespective of the SNR) or $\mathit{SNR}\to0$ (irrespective of the sparsity), then the proposed estimator becomes exact, i.e., the relative error vanishes and therefore $\widehat{N}_0 \xrightarrow[D\to\infty]{\text{prob.}} N_0$.
III-D Analysis of \frefest:signalpower
For the blind estimate of the average signal power $E_s$, we use the following lemma, which is derived from the fact that the entries of the vector $\mathbf{z}$ are i.i.d. with expected value $E_s+N_0$.
**Lemma 2.**
Let be a noisy BCG random vector with PDF as in \frefdef:noisyBCG. Then, according to the strong law of large numbers we have
[TABLE]
To obtain \frefest:signalpower in \frefeq:signalpowerestimator, we construct a blind estimator of $E_s$ by taking the left side of \frefeq:limit_of_Es and replacing the average noise power $N_0$ with the blind estimate $\widehat{N}_0$ from \frefest:noisevariance. To avoid negative values of the signal power estimate, which have no physical meaning, we set the estimate to zero whenever it would be negative. Since the estimate $\widehat{N}_0$ overestimates the true average noise power $N_0$, the blind estimate in \frefeq:signalpowerestimator tends to underestimate the signal power. From \frefthm:mainresult it follows that for $p\to0$ or $\mathit{SNR}\to0$, the blind signal power estimate is exact.
III-E Analysis of \frefest:SNR
The blind SNR estimator is obtained by simply taking the ratio of the signal power estimate in \frefeq:signalpowerestimator and the noise power estimate $\widehat{N}_0$ in \frefeq:noiseestimator. For $D\to\infty$, the blind signal power estimate underestimates the average signal power and the noise power estimate overestimates the average noise power, which means that the blind SNR estimate in \frefeq:SNRestimateomg underestimates the SNR. From \frefthm:mainresult it follows that for $D\to\infty$ with either $p\to0$ or $\mathit{SNR}\to0$, the blind SNR estimate is exact.
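The signal power and SNR estimates described in the last two subsections can be sketched together: the noise power estimate is subtracted from the average received power, the result is clipped at zero, and the SNR estimate is the ratio of the two. The function name `blind_power_estimates` and the test parameters (including the assumed variance `Es / p` of the active entries) are hypothetical.

```python
import numpy as np

def blind_power_estimates(y):
    """Return (N0_hat, Es_hat, snr_hat) computed from the noisy observation only."""
    z = np.abs(y) ** 2
    N0_hat = np.median(z) / np.log(2.0)       # median-based noise power estimate
    Es_hat = max(np.mean(z) - N0_hat, 0.0)    # clip negative values to zero
    snr_hat = Es_hat / N0_hat
    return N0_hat, Es_hat, snr_hat

# Illustrative noisy sparse vector (all parameters are assumptions).
rng = np.random.default_rng(4)
D, p, Es, N0 = 8192, 0.05, 10.0, 1.0
active = rng.random(D) < p
s = np.where(active,
             np.sqrt(Es / p / 2.0) * (rng.standard_normal(D) + 1j * rng.standard_normal(D)),
             0.0)
n = np.sqrt(N0 / 2.0) * (rng.standard_normal(D) + 1j * rng.standard_normal(D))
N0_hat, Es_hat, snr_hat = blind_power_estimates(s + n)
```

Because the noise power tends to be overestimated, the signal power and SNR estimates produced this way tend to fall below the true values.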
III-F Analysis of \frefest:MSE
In order to analyze \frefest:MSE, we first assume that the average noise power $N_0$ is known. For this scenario, we can borrow the following two theorems from [8].
**Theorem 2 (Thm. 1 of [8]).**
Consider \frefsys:systemmodel2. Then, Stein’s unbiased risk estimate given by
[TABLE]
is an unbiased estimate of the MSE so that
**Theorem 3 (Thm. 3 of [8]).**
If is pseudo-Lipschitz, then SURE in \frefeq:complexSURE converges to the MSE in the large-dimension limit, i.e., we have
\frefthm:SUREconvergence implies that if $N_0$ were known perfectly, then one could perfectly estimate the MSE in the large-dimension limit without knowledge of the sparse signal vector. For smaller values of the dimension $D$, \frefthm:MSEappox only ensures equality in expectation (while the estimator remains MSE-optimal). Equality in expectation means that some realizations will underestimate and some realizations will overestimate the true MSE. (Footnote 6: We have to keep in mind that we use the estimated MSE to determine the parameters of the estimation function that minimize the MSE for each given realization of the noisy observation. Therefore, offsets that depend on the realization of the noisy observation can be treated as a constant and thus be ignored, even if these offsets cause the estimated MSE to take on negative values. In other words, we are not interested in the true value of the MSE, but rather in the shape of the MSE function with respect to the parameters of the estimation function.)
\frefest:MSE is a blind version of SURE, in which we have replaced the true average noise power $N_0$ by its estimate $\widehat{N}_0$. Consequently, for $D\to\infty$ and either $p\to0$ or $\mathit{SNR}\to0$, we have that: (i) \frefrem:Nolargedimension states that $\widehat{N}_0$ will be exact, from which it follows that the blind version of SURE coincides with SURE, (ii) \frefthm:SUREconvergence ensures that SURE converges to the MSE, and therefore (iii) \frefest:MSE will be exact in this scenario. For higher values of $p$ or SNR, we know that $\widehat{N}_0$ tends to overestimate $N_0$, but since this estimated quantity appears twice in \frefeq:MSEnonparametricexplicitform with different signs, we cannot derive a simple rule that states whether \frefest:MSE tends to underestimate or overestimate the MSE.
III-G Analysis of \frefest:sandwich
\frefest:sandwich is derived as the mean of the lower and upper bounds in \frefeq:yummysandwichbound, utilizing the SNR estimate from \frefest:SNR and an activity rate estimate of the user’s choice. \frefest:sandwich often improves the performance (achieves lower bias) compared to \frefest:noisevariance, especially at high SNR. In contrast to \frefest:noisevariance, we no longer know whether the noise power from \frefest:sandwich is being overestimated or underestimated. As this estimator takes the activity rate as a parameter, it is especially useful in applications where the activity rate is known a priori or bounded (e.g., in OFDM systems, the number of nonzero delay taps of the channel’s impulse response should not exceed the cyclic prefix length).
III-H Analysis of \frefest:p
To estimate the activity rate, we can use the equivalence of vector norms [45] that states holds for any vector if . In particular, it holds for a vector of length that contains only the nonzero entries of the sparse vector . For such vector, we have that . Since the entries of that are zero do not contribute to these norms, we note that and , and therefore
[TABLE]
Using \frefeq:sparsenorminequality, we can obtain a lower bound for the activity rate (footnote 7: The activity rate is . When is finite, we have .):
[TABLE]
The inequality in \frefeq:norminequalitybound holds with equality if the nonzero entries of the signal are constant-modulus, i.e., if , . We obtain the blind estimator from the left side of \frefeq:norminequalitybound, by replacing with its noisy version . With this substitution the inequality is not preserved (except if ), but we use that definition of as a rough activity rate estimate instead of picking an arbitrary value.
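The norm-equivalence argument can be sketched in Python under explicit assumptions. For a vector with $K$ nonzero entries and norm orders $a<b$, the standard inequality $\|x\|_a \le K^{1/a-1/b}\|x\|_b$ gives $K \ge (\|x\|_a/\|x\|_b)^{ab/(b-a)}$, with equality for constant-modulus nonzero entries. The function name `activity_rate_lower_bound` and the default orders `a=1`, `b=2` are hypothetical; the norm orders used in the paper are not recoverable from the text here.

```python
import numpy as np

def activity_rate_lower_bound(x, a=1.0, b=2.0):
    """Norm-ratio lower bound on the fraction of nonzero entries of x (requires a < b).

    Note: an all-zero input leads to a division by zero and is not handled here.
    """
    if not a < b:
        raise ValueError("the norm orders must satisfy a < b")
    mag = np.abs(x)
    norm_a = np.sum(mag ** a) ** (1.0 / a)
    norm_b = np.sum(mag ** b) ** (1.0 / b)
    k_bound = (norm_a / norm_b) ** (a * b / (b - a))  # lower bound on the support size
    return k_bound / x.size

# Constant-modulus example: 25 unit-modulus entries out of 100.
x_cm = np.zeros(100, dtype=complex)
x_cm[:25] = np.exp(1j * np.arange(25))
p_cm = activity_rate_lower_bound(x_cm)

# Gaussian nonzero entries: the bound falls below the true activity rate of 0.25.
rng = np.random.default_rng(5)
x_g = np.zeros(100, dtype=complex)
x_g[:25] = rng.standard_normal(25) + 1j * rng.standard_normal(25)
p_g = activity_rate_lower_bound(x_g)
```

Applying the same function to a noisy observation instead of the noiseless sparse vector yields only a rough estimate, since the inequality is then no longer guaranteed.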
III-I Analysis of \frefest:EM
\frefest:EM is a specialized variant of a classical EM algorithm for a two-component Gaussian mixture [42], adapted to complex-valued and zero-mean variables. We consider signal and noise power estimation from a noisy BCG signal as in \frefdef:noisyBCG. To understand it as a Gaussian source-separation problem, we consider that each entry of the noisy observation is a realization of either (i) just noise, or (ii) signal plus noise, both of which are zero-mean circularly-symmetric complex Gaussian. Just-noise realizations occur with probability $1-p$, while signal-plus-noise realizations occur with probability $p$. Using EM, we estimate the variances of the two circularly-symmetric complex Gaussians and the two mixture weights. We use our prior knowledge to set the mean of the two distributions to zero, unlike classical EM algorithms that also estimate the means. We make the following observations: (i) This model allows any signal sparsity, as opposed to \frefest:noisevariance, which assumes a maximum activity rate (see \frefeq:probabilitycondition). (ii) In the low SNR regime, EM may not be able to separate the noise and signal components, as the two variances are then nearly equal. (iii) The accuracy and the complexity of the algorithm will depend on the maximum number of iterations, the tolerance, the variance and weight initializations, and the noisy realization.
To avoid EM converging to pathological solutions with arbitrary initialization, we initialize the algorithm with the following two minimum assumptions: (i) The signal is sparse, or equivalently , and (ii) the power of the entries of that contain only noise is smaller than the power of the entries of that contain signal plus noise, or equivalently . This translates to initializing the Gaussian mixture variances and weights with , , , and . We verify that for this initialization, the average power of the mixture is , as expected.
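A compact Python sketch of such an EM iteration for a two-component, zero-mean, circularly-symmetric complex Gaussian mixture is given below. It is an illustration under stated assumptions rather than the paper's exact algorithm: the initial weight `w1=0.25`, the initial variance split, the stopping rule, and the function name `em_noise_signal_power` are hypothetical choices that merely respect the two conditions above (sparse signal, noise-only variance below signal-plus-noise variance) and keep the mixture power equal to the average received power. The signal power is recovered assuming the active entries have variance `Es / p`.

```python
import numpy as np

def em_noise_signal_power(y, w1=0.25, max_iter=100, tol=1e-4):
    """EM for a two-component zero-mean complex Gaussian mixture.

    Returns (N0_hat, Es_hat, p_hat). Means are fixed to zero; only the two
    variances and the mixture weight are updated.
    """
    z = np.abs(y) ** 2
    total = np.mean(z)
    v0 = 0.5 * total                        # noise-only variance (smaller component)
    v1 = (total - (1.0 - w1) * v0) / w1     # keeps (1-w1)*v0 + w1*v1 == mean(|y|^2)
    for _ in range(max_iter):
        # E-step: posterior probability that an entry is signal plus noise.
        log_odds_noise = (np.log((1.0 - w1) / w1) + np.log(v1 / v0)
                          - z * (1.0 / v0 - 1.0 / v1))
        r = 1.0 / (1.0 + np.exp(np.clip(log_odds_noise, -700.0, 700.0)))
        # M-step: update the weight and the two variances.
        w1_new = float(np.clip(np.mean(r), 1e-12, 1.0 - 1e-12))
        v1_new = np.sum(r * z) / max(np.sum(r), 1e-300)
        v0_new = np.sum((1.0 - r) * z) / max(np.sum(1.0 - r), 1e-300)
        change = abs(w1_new - w1) + abs(v0_new - v0) + abs(v1_new - v1)
        w1, v0, v1 = w1_new, v0_new, v1_new
        if change < tol * total:            # early stopping on total parameter change
            break
    return v0, max(w1 * (v1 - v0), 0.0), w1

# Illustrative noisy sparse vector (all parameters are assumptions).
rng = np.random.default_rng(3)
D, p, Es, N0 = 8192, 0.1, 10.0, 1.0
active = rng.random(D) < p
s = np.where(active,
             np.sqrt(Es / p / 2.0) * (rng.standard_normal(D) + 1j * rng.standard_normal(D)),
             0.0)
n = np.sqrt(N0 / 2.0) * (rng.standard_normal(D) + 1j * rng.standard_normal(D))
N0_em, Es_em, p_em = em_noise_signal_power(s + n)
```

Each iteration evaluates exponentials and divisions for every entry, which is the main source of the higher complexity relative to a median computation.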
IV Synthetic Results
We now characterize the accuracy of the estimators proposed in \frefsec:nonparametricestimators. We use the sparse signal model in \frefdef:noisyBCG. Without loss of generality, we fix the noise power to , while varying the signal power , the activity rate , and the dimension of the vectors. For different sets of parameters, we perform Monte–Carlo simulations with trials. In the plots, the thicker line with markers shows the average performance of an estimator, while the shaded area shows the region closer than one standard deviation away from the mean performance, a measure of the precision of the estimator.
IV-A Evaluation of the Noise Power, Signal Power, SNR, and Activity Rate Estimators
\freffig:estNo shows the effect of the SNR on the performance of the proposed blind nonparametric estimator  from \frefeq:noiseestimator and the proposed blind parametric estimate  from \frefeq:sandwich_estimator, for which we only include results using  with  and  for the activity rate estimate, as these parameters showed the best performance in our simulations, outperforming other values of  and , and a fixed value of  which is the center of the simulated range . We also simulate the baseline EM estimate described in \frefest:EM, initialized with  and , a maximum of  iterations and early stopping if the total parameter change is below %. As a baseline, we plot the genie-aided estimator  that has separate knowledge of  and the reference parameter .
\freffig:varyingSNR shows the effect of the SNR on the performance of the proposed signal power and SNR estimators for an activity rate of  and a dimension of . In this case, , the genie-aided estimators that have separate knowledge of  and  are  and , and the reference parameters are  and .
From Figures 1 and 2, we observe the following facts about the blind nonparametric estimators: (i) For sparse vectors (), our estimators have a precision comparable to that of the genie-aided estimators even for a small sample size of . (ii) The precision of all considered estimators decreases as increases. (iii) As predicted by our theory, the average noise power is overestimated while the signal power and SNR are underestimated. (iv) At low SNR, the median-based estimators for these three quantities become exact. We also observe that the proposed blind parametric estimate with is more accurate than the blind nonparametric estimate at high SNR. However, has fewer theoretical guarantees and is not an upper bound on .
\freffig:pest shows the accuracy of the blind, parametric activity rate \frefest:p. We see that at low and high SNR,  tends to overestimate  while  tends to underestimate it. Overall,  results in the best performance when combined with \frefest:sandwich. Admittedly, this is only a rough estimator and we include it as an example of what could be plugged into \frefest:sandwich or \frefest:EM. Nonetheless, we emphasize that side information about the signal’s sparsity should be utilized whenever available.
In comparison with EM (cf. Figures 1 and 2), our methods provide a less-accurate estimate at higher SNR, but require significantly lower complexity. The complexity of the baseline EM algorithm (in terms of the number of operations such as real-valued additions, real-valued multiplications, and exponentials) is more than operations—with early stopping, the average number of iterations observed in our simulations ranges from to depending on the SNR. In contrast, our proposed median-based noise estimator has an average complexity of no more than operations if the median is computed using quickselect [39], and avoids the evaluation of operations such as exponentials and divisions. Hence, our proposed blind estimator is more than less complex than the baseline EM algorithm, which renders our method suitable (i) for low-complexity parameter estimation and (ii) as a potential initializer for EM-based estimators.
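To illustrate the selection-based median mentioned above, the following sketch computes the median with NumPy's `np.partition`, which uses a selection algorithm (introselect, a quickselect variant) instead of a full sort. The function name `median_by_selection` is hypothetical, and the snippet makes no claim about the exact operation counts quoted in the text.

```python
import numpy as np

def median_by_selection(z):
    """Median via partial selection (no full sort)."""
    z = np.asarray(z, dtype=float)
    mid = z.size // 2
    if z.size % 2:
        return np.partition(z, mid)[mid]
    # Even length: average the two middle order statistics.
    part = np.partition(z, [mid - 1, mid])
    return 0.5 * (part[mid - 1] + part[mid])

rng = np.random.default_rng(6)
z_odd = rng.exponential(1.0, size=1001)
z_even = rng.exponential(1.0, size=1000)
N0_hat_odd = median_by_selection(z_odd) / np.log(2.0)
```

When the surrounding algorithm already sorts the magnitudes, the median can instead be read off the sorted array at no extra cost.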
IV-B Accelerated Convergence of EM Using Median-Based Initialization
\freffig:iterations shows the effect of initialization on the EM algorithm. To study the rate of convergence, we disable early stopping by setting  so that the number of iterations is always , and plot the relative error  as we vary . We compare the convergence of (i) the accelerated EM algorithm (diamond markers) which is initialized with the blind nonparametric estimate , and (ii) the baseline EM algorithm (circular markers) initialized with a fixed initialization of , which corresponds to setting the noise power to  of the received power . We simulated various values of  and SNR, and picked four examples that are representative. At low SNR or high sparsity (low ), the accelerated EM algorithm converges already in the first iteration. In contrast, the baseline EM algorithm needs more than 16 iterations to converge in some cases. The only case in which we observe the baseline outperforming the accelerated EM algorithm is in \freffig:p0p4SNR5, in which (i) the baseline has an advantage since  coincides exactly with the true value of , and (ii) the SNR is high and the sparsity is low, making it the worst case for the estimate  used by the accelerated EM. We also examine the effect of initializing the activity rate with (i) a fixed value of 0.25, versus (ii) the blind parametric estimate , and we observe no significant difference, especially for the preferred accelerated EM algorithm; however, as  showed better performance than a fixed value when used in the parametric noise power estimator , we prefer  when no side information about the signal’s sparsity is available.
IV-C Evaluation of the MSE Estimator
To evaluate the performance of the MSE estimator, we consider \frefsys:systemmodel2 with being the soft-thresholding function defined as
[TABLE]
where the denoising threshold is a real number .
\freffig:MSE shows two realizations of the estimated MSE  as a function of the tuning parameter . The only reference in this case is the genie-aided estimator . We picked two examples that are representative of what we have observed through multiple experiments with different system parameters to illustrate the following observations: (i) If the MSE function has a pronounced minimum as in \freffig:MSE_SNR10, then the value of  that minimizes the blind estimate  tends to be very close to the value that minimizes the genie-aided MSE function. (ii) If the MSE function has a less pronounced minimum as in \freffig:MSE_SNR0p5, then the value of  that minimizes the blind estimate  may be far from the value that minimizes the genie-aided MSE function. In spite of that, because the MSE function is flat near the minimum, the genie-aided MSE function evaluated at these two values of  returns values that are similar. In other words, (i) and (ii) summarize our observations that our algorithm finds a near-optimal (sub-optimal) denoising threshold  when the MSE of the denoised channel is (not) sensitive to . Note that here we have only picked two representative realizations; in \frefsec:channel_denoising, we validate our estimator with quantitative results by showing the denoising performance averaged over many realizations.
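The threshold selection discussed above can be sketched in Python. The SURE expression used below is obtained by applying Stein's lemma to complex soft-thresholding and may differ in notation from the equation in the paper; the blind variant simply plugs in the median-based noise power estimate. The function names, the grid search over the threshold, and the test parameters (including the assumed variance `Es / p` of the active entries) are illustrative assumptions.

```python
import numpy as np

def soft_threshold(y, tau):
    """Complex soft-thresholding: shrink magnitudes by tau and keep the phases."""
    mag = np.abs(y)
    return y * np.maximum(1.0 - tau / np.maximum(mag, np.finfo(float).tiny), 0.0)

def sure_soft_threshold(y, tau, N0):
    """SURE for complex soft-thresholding with per-entry noise power N0."""
    mag = np.abs(y)
    above = mag > tau
    n_above = np.count_nonzero(above)
    n_below = y.size - n_above
    data_term = (np.sum(mag[~above] ** 2) + tau ** 2 * n_above) / y.size
    div_term = N0 * tau * np.sum(1.0 / mag[above]) / y.size
    return data_term + N0 - div_term - 2.0 * N0 * n_below / y.size

def blind_threshold(y, num_grid=200):
    """Grid search for the threshold minimizing SURE with the blind noise estimate."""
    N0_hat = np.median(np.abs(y) ** 2) / np.log(2.0)
    taus = np.linspace(0.0, np.max(np.abs(y)), num_grid)
    costs = [sure_soft_threshold(y, t, N0_hat) for t in taus]
    return taus[int(np.argmin(costs))]

# Illustrative noisy sparse vector (all parameters are assumptions).
rng = np.random.default_rng(7)
D, p, Es, N0 = 2048, 0.05, 10.0, 1.0
active = rng.random(D) < p
s = np.where(active,
             np.sqrt(Es / p / 2.0) * (rng.standard_normal(D) + 1j * rng.standard_normal(D)),
             0.0)
y = s + np.sqrt(N0 / 2.0) * (rng.standard_normal(D) + 1j * rng.standard_normal(D))

tau_blind = blind_threshold(y)
mse_blind = np.mean(np.abs(soft_threshold(y, tau_blind) - s) ** 2)  # needs the true s
mse_ml = np.mean(np.abs(y - s) ** 2)                                # no denoising
```

The true MSE values at the end require the noiseless signal and are computed only to check the blind choice; the threshold itself is selected from the noisy observation alone.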
V Applications to Nonparametric Channel-Vector Denoising
We show three applications in wireless systems, in which the quality of channel estimates is essential for data detection. Concretely, we show that our algorithms can be applied to adaptively denoise pilot-based channel estimates, resulting in a reduced (improved) bit-error-rate (BER).
V-A Infinite-Resolution Massive Multiuser MIMO System
We start with an application of \frefest:MSE for beamspace channel estimation. As in [7], we simulate an uplink massive multiuser (MU) MIMO system in which single-antenna user equipments (UEs) transmit channel-estimation pilots and data to a basestation (BS) equipped with a uniform linear array of antenna elements. The UEs are randomly placed with a uniform distribution in a [math] circular sector around the BS, with a minimum distance of m and maximum distance of m from the BS. A minimum angular separation of [math] between UEs is enforced. We assume UE-side perfect power control (UEs adjust their transmit power so that the received power at the BS is equal for all UEs), and we ignore quantization at transmitter and receiver sides, assuming infinite-resolution signals.
We simulate a noiseless channel matrix using line-of-sight (LoS) realizations from the mmMAGIC QuaDRiGa model [46] with a carrier frequency of GHz. Each complex-valued entry of the channel matrix contains the attenuation and phase between the th UE and the th BS antenna. For the channel estimation step, the UEs transmit orthogonal pilots. The maximum likelihood (ML) estimate of the channel matrix is obtained by right-multiplying the (noisy) received pilot sequence with the inverse of the orthogonal pilot matrix, resulting in
[TABLE]
where is the antenna-domain channel matrix, is complex Gaussian channel estimation noise with power per complex entry, and is the ML channel estimate, which is a noisy observation of . The beamspace representation of the ML estimate is obtained by taking a spatial Fourier transform across the antenna array resulting in
[TABLE]
Here, beamspace-domain quantities are designated by a tilde. Then, is the beamspace channel matrix, has the same distribution as as the discrete Fourier transform matrix is unitary, and is the beamspace ML channel estimate, which is a noisy observation of . Column indices of correspond to UEs, while row indices correspond to different angles-of-arrival to the BS. Since electromagnetic waves at high carrier frequencies experience strong attenuation, typical mmWave channels consist only of a small number of dominant propagation paths arriving at the BS. Thus, each column of (which is the beamspace channel vector of one UE) will be approximately sparse, with many entries being close to zero.
By writing each column of \frefeq:HMLbeamspace as an independent equation, we can express the channel estimation problem in the form of \frefsys:systemmodel1, that is, each beamspace channel vector (that contains only few nonzero entries) corresponds to a sparse signal . The sparsity property implies that we can perform denoising to improve the ML channel estimate. After channel estimation, all UEs transmit data simultaneously using uncoded 16-QAM symbols and the BS performs data detection using the estimated channel vectors and linear minimum MSE equalization.
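The beamspace transformation can be illustrated with a unitary DFT across the antenna dimension. The sketch below uses a single on-grid plane wave as a toy LoS channel, which is a hypothetical stand-in for the QuaDRiGa channels used in the simulations; the DFT sign convention, the array size, and the function names `to_beamspace` and `to_antenna` are likewise assumptions.

```python
import numpy as np

def to_beamspace(H):
    """Unitary DFT across the antenna dimension (rows of H)."""
    return np.fft.fft(H, axis=0, norm="ortho")

def to_antenna(H_beam):
    """Inverse unitary DFT, mapping beamspace back to the antenna domain."""
    return np.fft.ifft(H_beam, axis=0, norm="ortho")

B = 64                     # number of BS antennas (illustrative)
angle_index = 5            # on-grid spatial frequency of the toy plane wave
h = np.exp(2j * np.pi * angle_index * np.arange(B) / B)[:, None]  # one UE, one path

h_beam = to_beamspace(h)
num_significant = np.count_nonzero(np.abs(h_beam) > 1e-9)
```

For this toy channel, all energy falls into a single beamspace entry, and because the transform is unitary, i.i.d. circularly-symmetric Gaussian noise keeps its distribution. Realistic channels with off-grid angles and multiple paths are only approximately sparse.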
\freffig:inf_res shows simulation results for  Monte–Carlo trials. For different channel estimation methods, we compute the MSE of the channel estimates and the resulting BER. We simulate beamspace channel estimation (BEACHES) as in [7], which denoises the columns of  in \frefeq:HMLbeamspace by applying the soft-thresholding function in \frefeq:soft-thresholding; the thresholding parameter is adaptively selected for each noisy observation by minimizing SURE using an algorithm that assumes perfect knowledge of the average noise power . We compare this to NP BEACHES, a new nonparametric BEACHES variant which also applies soft-thresholding to the columns of , but uses the (nonparametric) threshold  that minimizes  as in \frefest:MSE; since  is a nonparametric version of SURE, NP BEACHES does not require knowledge of . In addition, we include a variant that we call EM BEACHES, which uses a version of  in which  in \frefeq:MSEnonparametricexplicitform is replaced by  from \frefest:EM; for , we use  and , a maximum of  iterations and early stopping if the total parameter change is below %. The three versions of BEACHES as described above, after denoising the beamspace channel vectors, use the inverse Fourier transform to obtain an antenna-domain channel estimate to be used for data detection. As a reference, we show the performance of perfect channel state information (CSI) that uses the ground truth (noiseless) channel matrix , and ML estimation that simply takes the noisy observation  in \frefeq:HML as the estimate.
From \freffig:inf_res, we observe that NP BEACHES achieves virtually the same performance as the original BEACHES algorithm (which requires knowledge of ), except at high SNR where \frefest:noisevariance tends to overestimate . We reiterate that NP BEACHES requires no parameters and exhibits the same low complexity of as the original BEACHES algorithm, because the latter already sorts the entries of , which we can reuse to compute the median in \frefest:noisevariance. We observe that EM BEACHES achieves higher (worse) MSE at low SNR and does not outperform NP BEACHES at higher SNR.
In summary, denoising methods can significantly improve the ML channel estimate. All three BEACHES variants achieve similar BER performance. However, BEACHES needs knowledge of the noise power and EM BEACHES exhibits higher complexity than our nonparametric estimate, which renders NP BEACHES the preferable denoising method in this application scenario.
V-B Low-Resolution Massive Multiuser MIMO System
Next, we consider the same uplink massive MU-MIMO system as in \frefsec:infinite-res, but in this case each radio-frequency (RF) chain at the BS is equipped with a pair of 1-bit analog-to-digital converters (ADCs) to quantize the in-phase and quadrature baseband signals. Each RF chain applies a quantization function  to the baseband signal, where . For simplicity, we assume that the pilot matrix is an identity, i.e., each UE has a dedicated time slot to transmit one pilot while all other UEs are silent. The received pilots then correspond to the 1-bit version of the ML channel estimate, which we call 1-bit ML (footnote 8: the 1-bit ML estimate is simply the 1-bit version of the ML channel estimate, not to be confused with the maximum likelihood channel estimate given a one-bit observation):
[TABLE]
Here, quantization happens in the antenna domain and yet, when the quantized noisy channel is converted to beamspace, the sparse structure that is present in infinite-resolution beamspace channel vectors is also present in the coarsely quantized beamspace channel vectors. Thus,
[TABLE]
has sparse columns that can be denoised. For more details on the validity of this statement, see [9] where was decomposed in a linear combination of plus a residual.
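The observation that coarse quantization largely preserves beamspace sparsity can be checked on a toy example. The quantizer below takes the signs of the real and imaginary parts, which is a common model for a pair of 1-bit ADCs; its scaling, the on-grid plane-wave channel, the phase offset, and the function name `one_bit_quantize` are assumptions made for this sketch.

```python
import numpy as np

def one_bit_quantize(x):
    """Sign quantization of the in-phase and quadrature components."""
    x = np.asarray(x)
    return np.where(x.real >= 0, 1.0, -1.0) + 1j * np.where(x.imag >= 0, 1.0, -1.0)

B = 64
n = np.arange(B)
h = np.exp(1j * (2 * np.pi * 5 * n / B + 0.3))   # toy plane wave with a phase offset

# Quantize in the antenna domain, then convert to beamspace.
h_q_beam = np.fft.fft(one_bit_quantize(h), norm="ortho")
energy = np.abs(h_q_beam) ** 2
strongest = int(np.argmax(energy))
fraction = energy[strongest] / energy.sum()      # energy share of the dominant beam
```

In this toy case most of the energy remains in the beam of the original plane wave, while the rest spreads over harmonics that act like additional distortion.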
\freffig:one_bit_res shows simulation results for  Monte–Carlo trials. For different channel estimation methods, we compute the MSE and BER. All UEs simultaneously transmit uncoded QPSK symbols, and the BS uses the estimated channels in order to perform 1-bit Bussgang linear minimum MSE equalization as described in [47].
We simulate -BEACHES as in [9]. This denoising algorithm decomposes \frefeq:HML1bitbeamspace as , where represents the equivalent noise-plus-quantization error with average power per entry [9]. The -BEACHES algorithm denoises the columns of with the threshold that minimizes SURE, assuming perfect knowledge of . We also use the nonparametric algorithms NP BEACHES and EM BEACHES (described in \frefsec:infinite-res) to denoise the columns of . After denoising the beamspace channel vectors, these three BEACHES variants use the inverse Fourier transform to obtain an antenna-domain channel estimate. We compare these estimators with from \frefeq:HML1b, and with the perfect CSI estimate that uses the ground truth as the channel estimate.
Since NP BEACHES uses the median-based noise estimate (which in this case estimates the effective “noise” floor that includes quantization errors), it is robust to outliers and is able to achieve MSE and BER performance very close to -BEACHES that has perfect knowledge of the noise-plus-quantization power. The EM estimator, however, strongly relies on the distribution of the noise and signal being Gaussian. Here, the signal is a realistic channel vector which is not Gaussian; more importantly, contains the effect of noise but also quantization error, which means the equivalent noise also deviates from a Gaussian distribution. We attribute the higher (worse) BER of EM BEACHES to these two factors. We note that -BEACHES is designed specifically for 1-bit quantization and that the expression for (which requires knowledge of the noise power and the signal power) would be different if the ADCs use a different number of bits. In contrast, our nonparametric denoiser is agnostic to the quantizer’s resolution and automatically determines the power of the noise plus quantization, as long as the signal is approximately sparse and the noise is approximately Gaussian.
V-C Cell-Free Communication System
We simulate an uplink cell-free communication system with single-antenna UEs and single-antenna BSs. The UEs and BSs are randomly placed with a uniform distribution in a square with area. The UEs transmit orthogonal pilots followed by QPSK data. All of the UEs transmit simultaneously and the received signal at all the BSs is processed at a central processing unit (CPU) that performs channel estimation and linear minimum MSE detection.
We simulate a cell-free channel matrix using the model proposed by [48], with parameters as in [2] but without power control and with a transmit power of mW per UE. As in \frefeq:HML, the ML estimate of the channel matrix is obtained by right-multiplying the pilot sequence received at the CPU with the inverse of the orthogonal pilot matrix (we used a Hadamard pilot matrix), resulting in
[TABLE]
The columns of (or channel vectors) contain the attenuations and phases between one UE and all BSs. For each UE, the BSs that have LoS or are closer to this UE will receive significantly higher power than the other BSs that are not nearby. This means that in the cell-free system, the channel vectors are approximately sparse [11] and the ML estimate can be denoised. Although the thermal noise variance at different basestations may differ, we assume i.i.d. noise in this paper.
\freffig:cell_free shows the results of  Monte–Carlo trials. On the left, we plot the CDF of the MSE of the channel estimates, and on the right, the CDF of the root-mean-squared-symbol-error (RMSSE). The RMSSE is a measure of how far the expected QPSK symbol is from the received data symbol after equalization with the channel estimates, and can be seen as equivalent to the error-vector-magnitude (EVM) for one UE.
In \freffig:cell_free, we observe a clear MSE improvement of the three denoising algorithms over the ML estimate: For a given value , there are more realizations of channel estimates whose MSE is smaller than for denoised channels than for ML. The fact that denoising improves the channel estimates is reflected in the RMSSE, since equalization is more effective and the obtained symbols are closer to the expected constellation points. We consider the RMSSE requirement of for QPSK from [49, Table 6.5.2.2-1]. The probability that a UE meets the requirement grows from with ML channel estimation, to with NP BEACHES or EM BEACHES denoising, an increase of . BEACHES with perfect knowledge of the noise power has a slight additional advantage, with a probability of meeting the requirement of .
VI Conclusions
We have proposed blind estimators for the average noise power, signal power, SNR, and MSE. Our estimators can be calculated at low complexity and only require the noisy observation vector, avoiding the need for additional pilot signals entirely. We have analyzed our estimators for a Bernoulli complex Gaussian sparsity model and evaluated their accuracy via simulations. Using three channel-vector denoising tasks in (i) a multi-antenna mmWave system, (ii) a 1-bit quantized multi-antenna mmWave system, and (iii) a cell-free system, we have demonstrated that our blind estimators can be used to develop a novel nonparametric denoiser that achieves comparable performance and the same complexity as BEACHES in [7, 8] which requires knowledge of the average noise power. We believe that the proposed blind estimators find potential use in a large number of other wireless communication applications that contain sparse complex-valued signals.
There are many avenues for future work. For signals that are less sparse (i.e., ), one may want to replace the median by a higher quantile and the scaling factor needs to be adjusted accordingly; a derivation of such estimators would follow immediately from our results in \frefsec:noise_estimator_analysis. Huber M-estimators [50] combine the idea of mean and median, and they may also prove useful for blind noise power estimation in the presence of sparse signals. In the case of non-Gaussian, non-circularly-symmetric, or non-i.i.d. sparse signals, new estimators can be tailored to exploit specific statistical properties (e.g., structured sparsity). Extending the statistical model, e.g., to signals with correlation or structured sparsity, can lead to improved estimators and is left for future work. In the case of colored noise (e.g., stemming from interference or large variations in radio-frequency circuitry), noise whitening techniques could be considered.
Appendix A Proof of \frefthm:mainresult
A-A Prerequisites
In what follows, we will need the distribution of , where we assume is distributed according to \frefdef:noisyBCG. Given a circularly-symmetric complex Gaussian RV with variance , the RV is exponentially distributed with CDF , . Then, the CDF of each entry of the absolute-square noisy observation is as follows.
**Definition 5 (Noisy BCG Power RV).**
Let be as in \frefdef:noisyBCG and let . Then, for , the CDF of each entry of is given by
[TABLE]
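As a numerical companion to this definition, the sketch below evaluates the CDF of the absolute-square noisy observation as a mixture of two exponential CDFs and solves for its median by bisection. The mixture form assumes noise-only entries with variance `N0` (probability `1 - p`) and signal-plus-noise entries with variance `N0 + Es / p` (probability `p`); this parametrization and all function names are assumptions of the sketch. The last function reports the large-dimension relative error of the median-based noise power estimate.

```python
import math

def noisy_bcg_power_cdf(z, p, Es, N0):
    """CDF of |y_k|^2 under the assumed noisy BCG model (mixture of two exponentials)."""
    noise_only = 1.0 - math.exp(-z / N0)
    if p == 0.0:
        return noise_only
    return (1.0 - p) * noise_only + p * (1.0 - math.exp(-z / (N0 + Es / p)))

def median_noisy_bcg_power(p, Es, N0, iters=100):
    """Solve CDF(m) = 1/2 by bisection."""
    lo, hi = 0.0, N0
    while noisy_bcg_power_cdf(hi, p, Es, N0) < 0.5:
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if noisy_bcg_power_cdf(mid, p, Es, N0) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def asymptotic_relative_error(p, snr, N0=1.0):
    """Relative error of median / log(2) as a noise power estimate for D -> infinity."""
    m = median_noisy_bcg_power(p, snr * N0, N0)
    return m / (N0 * math.log(2.0)) - 1.0

bias_sparse = asymptotic_relative_error(p=0.05, snr=10.0)  # small positive bias
bias_none = asymptotic_relative_error(p=0.0, snr=10.0)     # no signal: zero bias
```

Under these assumptions, the relative error is nonnegative and vanishes as the activity rate or the SNR goes to zero.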
A-B Upper Bounds on the Median
We start with the following two upper bounds on the median of a noisy BCG power RV with CDF given in \frefeq:noisyBCGpower.
**Lemma 3.**
For a noisy BCG power RV in \frefdef:noisyBCGsquared with , the median is bounded from above by
[TABLE]
Proof.
We start from the definition of the median in \frefeq:median for the RV with CDF as in \frefeq:noisyBCGpower:
[TABLE]
Since the second term is nonnegative, we can omit it to obtain the following inequality:
[TABLE]
Note that this bound will be useful for vectors that are sparse, i.e., where is small. We can simplify \frefeq:expression0 as
[TABLE]
which leads to the upper bound on the median . In order to take the logarithm in \frefeq:logarithmcondition, we require .∎
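The steps of this proof can be written out as follows. This is a reconstruction under the assumption that the CDF of $Z$ is a mixture of two exponential CDFs with means $N_0$ (weight $1-p$) and $N_0+E_s/p$ (weight $p$); the symbols $p$, $E_s$, and $N_0$ follow that assumed parametrization, and the final form may differ in notation from the statement of the lemma.

```latex
\begin{align}
\tfrac{1}{2} = F_Z(\mathsf{m}_Z)
  &= (1-p)\bigl(1 - e^{-\mathsf{m}_Z/N_0}\bigr)
   + p\bigl(1 - e^{-\mathsf{m}_Z/(N_0 + E_s/p)}\bigr) \\
  &\geq (1-p)\bigl(1 - e^{-\mathsf{m}_Z/N_0}\bigr)
  && \text{(second term is nonnegative)} \\
\Longrightarrow\quad e^{-\mathsf{m}_Z/N_0}
  &\geq \frac{1-2p}{2(1-p)} \\
\Longrightarrow\quad \mathsf{m}_Z
  &\leq N_0 \log\!\left(\frac{2(1-p)}{1-2p}\right),
  && p < \tfrac{1}{2}.
\end{align}
```

The requirement $p<1/2$ keeps the argument of the logarithm positive, and for $p\to0$ the right-hand side approaches $N_0\log(2)$.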
**Lemma 4.**
For a noisy BCG power RV in \frefdef:noisyBCGsquared with , the median is bounded from above by
[TABLE]
Proof.
We start from the definition of the median as in \fref{eq:niceform}. Let us define the function with . We can now rewrite \fref{eq:niceform} as follows:
[TABLE]
The function is concave for . Therefore, to ensure concavity of in \fref{eq:nicejensenexpression}, we need
[TABLE]
The two conditions in \fref{eq:twoconcavityconditions} are guaranteed as long as . Because CDFs are nondecreasing functions, requiring is equivalent to requiring , which we can simplify as
[TABLE]
Finally, to ensure \fref{eq:pconditionSNR} holds for all values of and , we require
[TABLE]
which implies that the condition in \fref{eq:probabilitycondition} ensures concavity of . Then, assuming , we can now use Jensen’s inequality on the expression in \fref{eq:nicejensenexpression} to get
[TABLE]
We can now simplify this expression to
[TABLE]
which is the inequality in \fref{lem:smollemma2}. ∎
A-C Lower Bound on the Median
We now establish the following lower bound on the median.
**Lemma 5.**
For a noisy BCG power RV in \fref{def:noisyBCGsquared} with , the median is bounded from below by
[TABLE]
Proof.
We start from the definition of the median as in \fref{eq:niceform}. Since the exponential CDF for is concave in , Jensen’s inequality leads to
[TABLE]
We can simplify this expression to obtain the following bound
[TABLE]
which leads to the inequality in \fref{lem:smollemma3} we wanted to prove. ∎
A-D Combining the Results
For all values of and , we have that
[TABLE]
and we defined such that $\widehat{N}_{0}\xrightarrow[D\to\infty]{\text{prob.}}\mathsf{m}_{Z}/\log(2)$ according to \fref{lem:convergenceofmedian}.
Finally, we can combine \fref{eq:uperuperbound} with \fref{lem:smollemma1}, \fref{lem:smollemma2}, and \fref{lem:smollemma3} to obtain \fref{eq:yummysandwichbound}.
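The convergence of the estimate to the median divided by log(2) can be checked numerically with a minimal sketch. The simulation assumes a Bernoulli complex Gaussian signal with activity rate `p` and signal power `Es` in circularly-symmetric Gaussian noise of power `N0`; the function names and parameter values are hypothetical and chosen only for illustration.

```python
import math
import random
import statistics


def sample_noisy_bcg_power(D, p, Es, N0, rng):
    """Draw D samples of |s + n|^2 under the assumed noisy BCG model."""
    samples = []
    for _ in range(D):
        # An active entry adds signal power Es to the noise power N0;
        # the sum of independent complex Gaussians is complex Gaussian.
        var = N0 + (Es if rng.random() < p else 0.0)
        # Circularly-symmetric complex Gaussian: var/2 per real dimension.
        re = rng.gauss(0.0, math.sqrt(var / 2.0))
        im = rng.gauss(0.0, math.sqrt(var / 2.0))
        samples.append(re * re + im * im)
    return samples


def estimate_noise_power(power_samples):
    """Median-based noise power estimate: sample median divided by log(2)."""
    return statistics.median(power_samples) / math.log(2)


if __name__ == "__main__":
    rng = random.Random(0)
    for p in (0.0, 0.05, 0.1, 0.2):
        z = sample_noisy_bcg_power(20000, p, Es=10.0, N0=1.0, rng=rng)
        print(f"p={p:.2f}  estimated N0={estimate_noise_power(z):.3f}")
```

With a true noise power of 1, the estimate should sit close to 1 for noise-only data and drift upward as `p` grows, which is the behavior the sandwich bound captures.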
Appendix B Proof of \fref{cor:errorbound}
Proof.
Let the relative error of \fref{est:noisevariance} be . Using the inequalities from \fref{thm:mainresult} and the quantities LB and UB defined there, we can bound as follows:
[TABLE]
By using $\widehat{N}_{0}\xrightarrow[D\to\infty]{\text{prob.}}\mathsf{m}_{Z}/\log(2)$ and substituting LB from \fref{eq:LB} and UB from \fref{eq:UB} into \fref{eq:relativeerrorbound}, after some simplifications, we obtain \fref{eq:errorbound}. ∎
- [1] A. Gallyas-Sanhueza and C. Studer, “Blind SNR estimation and nonparametric channel denoising in multi-antenna mmWave systems,” in IEEE Int. Conf. Commun. (ICC), Jun. 2021, pp. 1–7.
- [2] H. Song, X. You, C. Zhang, O. Tirkkonen, and C. Studer, “Minimizing pilot overhead in cell-free massive MIMO systems via joint estimation and detection,” in Proc. IEEE Int. Workshop Signal Process. Advances Wireless Commun. (SPAWC), May 2020, pp. 1–5.
- [3] T. Schenk, RF imperfections in high-rate wireless systems: impact and digital compensation. Springer Science & Business Media, 2008.
- [4] T. S. Rappaport, S. Sun, R. Mayzus, H. Zhao, Y. Azar, K. Wang, G. N. Wong, J. K. Schulz, M. Samimi, and F. Gutierrez, “Millimeter wave mobile communications for 5G cellular: It will work!” IEEE Access, vol. 1, pp. 335–349, May 2013.
- [5] F. Rusek, D. Persson, B. Kiong, E. G. Larsson, T. L. Marzetta, O. Edfors, and F. Tufvesson, “Scaling up MIMO: Opportunities and challenges with very large arrays,” IEEE Signal Process. Mag., vol. 30, no. 1, pp. 40–60, Jan. 2013.
- [6] 3GPP, “5G; NR; user equipment (UE) radio transmission and reception; part 1: Range 1 standalone,” Nov. 2020, TS 38.101-1 version 16.5.0 Rel. 16.
- [7] R. Ghods, A. Gallyas-Sanhueza, S. H. Mirfarshbafan, and C. Studer, “BEACHES: Beamspace channel estimation for multi-antenna mmWave systems and beyond,” in Proc. IEEE Int. Workshop Signal Process. Advances Wireless Commun. (SPAWC), Jul. 2019, pp. 1–5.
- [8] S. H. Mirfarshbafan, A. Gallyas-Sanhueza, R. Ghods, and C. Studer, “Beamspace channel estimation for massive MIMO mmWave systems: Algorithm and VLSI design,” IEEE Trans. Circuits Sys. I (TCAS-I), vol. 67, no. 12, pp. 5482–5495, Sep. 2020.
