The Sample Complexity of Robust Covariance Testing
Ilias Diakonikolas, Daniel M. Kane

TL;DR
This paper investigates the increased sample complexity required for robustly testing covariance matrices in high-dimensional Gaussian models under contamination, showing it is fundamentally harder than in the clean case.
Contribution
It establishes a tight lower bound of Ω(d^2) on the sample complexity of robust covariance testing, demonstrating the problem's inherent difficulty compared to the non-robust setting.
Findings
Robust covariance testing requires Ω(d^2) samples.
Robust covariance testing is as hard as robust covariance estimation.
In the absence of contamination, testing requires only O(d) samples.
Abstract
We study the problem of testing the covariance matrix of a high-dimensional Gaussian in a robust setting, where the input distribution has been corrupted in Huber's contamination model. Specifically, we are given i.i.d. samples from a distribution of the form Z = (1 − ε)X + εB, where X is a zero-mean and unknown covariance Gaussian N(0, Σ), B is a fixed but unknown noise distribution, and ε > 0 is an arbitrarily small constant representing the proportion of contamination. We want to distinguish between the cases that Σ is the identity matrix versus γ-far from the identity in Frobenius norm. In the absence of contamination, prior work gave a simple tester for this hypothesis testing task that uses O(d) samples. Moreover, this sample upper bound was shown to be best possible, within constant factors. Our main result is that the…
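To make the contamination model concrete, here is a minimal Python sketch that draws samples from a mixture of the form (1 − ε)·N(0, Σ) + ε·B and measures how far the naive empirical second-moment matrix lands from the identity in Frobenius norm. It is an illustration only, not the paper's tester or its lower-bound construction. The function names, the point-mass choice of B, and the values of d, n, and ε are all assumptions made for this example.

```python
import numpy as np


def sample_huber(n, sigma, eps, noise_sampler, rng):
    """Draw n i.i.d. samples from the mixture (1 - eps) * N(0, sigma) + eps * B."""
    d = sigma.shape[0]
    clean = rng.multivariate_normal(np.zeros(d), sigma, size=n)
    noise = noise_sampler(n, d, rng)
    # Each sample independently comes from B with probability eps.
    is_noise = rng.random(n) < eps
    return np.where(is_noise[:, None], noise, clean)


def frobenius_distance_to_identity(sigma):
    """Frobenius norm of (sigma - I)."""
    return np.linalg.norm(sigma - np.eye(sigma.shape[0]), ord="fro")


def point_mass_noise(n, d, rng):
    # Hypothetical noise distribution B: all mass on the point (3, ..., 3).
    return np.full((n, d), 3.0)


rng = np.random.default_rng(0)
d, n, eps = 5, 20000, 0.05

# The clean covariance is exactly the identity, but 5% of samples come from B.
samples = sample_huber(n, np.eye(d), eps, point_mass_noise, rng)

# X is zero-mean, so the naive covariance estimate is the second-moment matrix.
emp_cov = samples.T @ samples / n
print(frobenius_distance_to_identity(emp_cov))
```

In this sketch the true Σ is the identity, yet the naive statistic is pushed well away from it by the contaminated samples, which is why a non-robust tester cannot be applied unchanged once the input is corrupted.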
Taxonomy
Topics: Machine Learning and Algorithms · Sparse and Compressive Sensing Techniques · Statistical Methods and Inference
