Sharp Variable Selection of a Sparse Submatrix in a High-Dimensional Noisy Matrix
Cristina Butucea, Yuri I. Ingster, Irina Suslina

TL;DR
This paper develops a precise statistical method for identifying a sparse submatrix with elevated mean in a large noisy Gaussian matrix, establishing sharp thresholds for successful variable selection.
Contribution
It introduces a new sharp variable selection procedure with proven optimality and characterizes the exact thresholds for detection and selection in high-dimensional Gaussian matrices.
Findings
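Established sufficient conditions on the mean elevation, as a function of the matrix and submatrix dimensions, for uniformly consistent variable selection
Proved minimax lower bounds under complementary necessary conditions, with exact constants at the critical values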
Identified a gap between detection and selection thresholds
Abstract
We observe an $N \times M$ matrix of independent, identically distributed Gaussian random variables which are centered except for elements of some submatrix of size $n \times m$ where the mean is larger than some $a > 0$. The submatrix is sparse in the sense that $n/N$ and $m/M$ tend to 0, whereas $n$, $m$, $N$ and $M$ tend to infinity. We consider the problem of selecting the random variables with significantly large mean values. We give sufficient conditions on $a$ as a function of $n$, $m$, $N$ and $M$ and construct a uniformly consistent procedure in order to do sharp variable selection. We also prove the minimax lower bounds under necessary conditions which are complementary to the previous conditions. The critical values separating the necessary and sufficient conditions are sharp (we show exact constants). We note a gap between the critical values for selection of…
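The observation model in the abstract can be illustrated with a small simulation. The sketch below draws a noise matrix with a hidden block of elevated mean and then tries to recover the block by keeping the rows and columns with the largest sums. This selector is a simple stand-in chosen for illustration, not the procedure constructed in the paper, and it assumes the block dimensions are known. The function names (`simulate_matrix`, `select_by_sums`), the unit noise variance, and all numeric values are assumptions made for this example.

```python
import numpy as np

def simulate_matrix(N, M, n, m, a, rng):
    """Draw an N x M matrix of N(0, 1) noise and add mean a on a hidden n x m submatrix."""
    rows = rng.choice(N, size=n, replace=False)  # hidden row indices
    cols = rng.choice(M, size=m, replace=False)  # hidden column indices
    Y = rng.standard_normal((N, M))
    Y[np.ix_(rows, cols)] += a                   # elevate the mean on the block only
    return Y, np.sort(rows), np.sort(cols)

def select_by_sums(Y, n, m):
    """Illustrative selector (an assumption, not the paper's): top-n row sums, top-m column sums."""
    rows_hat = np.sort(np.argsort(Y.sum(axis=1))[-n:])
    cols_hat = np.sort(np.argsort(Y.sum(axis=0))[-m:])
    return rows_hat, cols_hat

# Deliberately strong signal so that the simple selector succeeds.
rng = np.random.default_rng(0)
Y, rows, cols = simulate_matrix(N=200, M=200, n=20, m=20, a=6.0, rng=rng)
rows_hat, cols_hat = select_by_sums(Y, n=20, m=20)
print("rows recovered:", np.array_equal(rows_hat, rows))
print("columns recovered:", np.array_equal(cols_hat, cols))
```

Lowering the mean elevation in this sketch makes the row and column sums of the hidden block harder to separate from pure noise, which is the regime where the sharp thresholds discussed in the abstract matter.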
