Robust covariance estimation with missing values and cell-wise contamination
Karim Lounici, Grégoire Pacreau

TL;DR
This paper introduces a new unbiased covariance estimator that handles missing data and cell-wise outliers without imputation, achieving high accuracy and computational efficiency in high-dimensional settings.
Contribution
It proposes a novel covariance estimator that works with missing values and outliers without imputation, suitable for high-dimensional data.
Findings
Outperforms existing methods in accuracy and stability
Effective on both low- and high-dimensional datasets
Reduces computational time compared to state-of-the-art approaches
Abstract
Large datasets are often affected by cell-wise outliers in the form of missing or erroneous data. However, discarding any samples containing outliers may result in a dataset that is too small to accurately estimate the covariance matrix. Moreover, the robust procedures designed to address this problem require the invertibility of the covariance operator and thus are not effective on high-dimensional data. In this paper, we propose an unbiased estimator for the covariance in the presence of missing values that does not require any imputation step and still achieves near minimax statistical accuracy with the operator norm. We also advocate for its use in combination with cell-wise outlier detection methods to tackle cell-wise contamination in a high-dimensional and low-rank setting, where state-of-the-art methods may suffer from numerical instability and long computation times. To…
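To make the idea of an imputation-free, debiased covariance estimate concrete, here is a minimal sketch. It is not the authors' estimator. It assumes centered data, cells missing completely at random with one common observation probability `delta`, and it uses the classical inverse-probability correction of the zero-filled empirical covariance: diagonal entries are rescaled by 1/delta and off-diagonal entries by 1/delta^2. The helper `flag_cells_mad` is a hypothetical stand-in for a cell-wise outlier detector: it marks cells far from the column median (in MAD units) as unobserved so they can be handled like missing values. Both function names and the threshold of 3 are assumptions made for illustration.

```python
import numpy as np


def debiased_covariance(X, mask, delta=None):
    """Inverse-probability-corrected covariance from partially observed data.

    X     : (n, p) array of centered data; cells where mask is False are ignored.
    mask  : (n, p) boolean array, True where the cell is observed.
    delta : assumed common observation probability; if None it is
            estimated as the fraction of observed cells.
    """
    X = np.asarray(X, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    if delta is None:
        delta = mask.mean()
    # Zero-fill unobserved cells (no imputation model involved).
    Y = np.where(mask, X, 0.0)
    # Empirical second-moment matrix of the zero-filled data.
    S = Y.T @ Y / Y.shape[0]
    D = np.diag(np.diag(S))
    # E[S_ii] = delta * Sigma_ii and E[S_ij] = delta^2 * Sigma_ij for i != j,
    # so this combination is unbiased under the stated assumptions.
    return (1.0 / delta - 1.0 / delta**2) * D + S / delta**2


def flag_cells_mad(X, threshold=3.0):
    """Hypothetical cell-wise detector: True for cells to keep.

    A cell is kept if it is finite and lies within `threshold` scaled MADs
    of its column median.
    """
    X = np.asarray(X, dtype=float)
    med = np.nanmedian(X, axis=0)
    mad = 1.4826 * np.nanmedian(np.abs(X - med), axis=0)
    mad = np.where(mad > 0, mad, 1.0)  # guard against constant columns
    z = np.abs(X - med) / mad
    return np.isfinite(X) & (z <= threshold)


# Usage sketch: simulate data, hide 30% of the cells, then estimate.
rng = np.random.default_rng(0)
Sigma = np.array([[1.0, 0.5], [0.5, 2.0]])
X_full = rng.multivariate_normal(np.zeros(2), Sigma, size=5000)
observed = rng.random(X_full.shape) < 0.7
Sigma_hat = debiased_covariance(X_full, observed)
```

Note that treating flagged cells as missing breaks the missing-completely-at-random assumption, since removal then depends on the cell's value, so combining the two functions should be read as a heuristic rather than an unbiased procedure.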
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Sparse and Compressive Sensing Techniques · Advanced Statistical Methods and Models
