Multiple Hypotheses Testing For Variable Selection
Florian Rohart

TL;DR
This paper introduces two novel multiple hypotheses testing methods for variable selection in high-dimensional sparse linear models, demonstrating superior performance over FDR and Lasso in both low and high-dimensional settings.
Contribution
The paper proposes two new multiple hypotheses testing procedures for variable selection, extending existing methods with non-asymptotic guarantees and improved accuracy.
Findings
The proposed procedures estimate the set of relevant variables more accurately than both FDR and the Lasso.
Procedures are powerful under certain signal conditions.
Applicable in both p<n and p>n scenarios.
Abstract
Many methods have been developed to estimate the set of relevant variables in a sparse linear model Y = XB + e, where the dimension p of B can be much higher than the length n of Y. Here we propose two new methods based on multiple hypotheses testing, for either ordered or non-ordered variables. Our procedures are inspired by the testing procedure proposed by Baraud et al. (2003). The new procedures are proved to be powerful under some conditions on the signal, and their properties are non-asymptotic. They gave better results in estimating the set of relevant variables than both the False Discovery Rate (FDR) and the Lasso, in the common case (p<n) as well as in the high-dimensional case (p>n).
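For context on the FDR baseline mentioned in the abstract, the following is a minimal sketch of the Benjamini-Hochberg step-up rule applied to per-variable p-values. It is not the procedure proposed in the paper, and the abstract does not say which FDR procedure was used as a comparison, so treating it as Benjamini-Hochberg is an assumption. The function name `benjamini_hochberg`, the level `alpha`, and the example p-values are hypothetical and chosen only for illustration.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return the sorted indices of hypotheses rejected by the BH step-up rule.

    Illustrative baseline only (assumption: one p-value per candidate
    variable, testing the null hypothesis that its coefficient is zero).
    """
    m = len(p_values)
    # Order hypothesis indices by increasing p-value.
    order = sorted(range(m), key=lambda j: p_values[j])
    # Find the largest 1-based rank k with p_(k) <= k * alpha / m.
    k_max = 0
    for rank, j in enumerate(order, start=1):
        if p_values[j] <= rank * alpha / m:
            k_max = rank
    # Reject the k_max hypotheses with the smallest p-values.
    return sorted(order[:k_max])


# Hypothetical p-values for six candidate variables.
p_values = [0.001, 0.8, 0.012, 0.04, 0.3, 0.02]
selected = benjamini_hochberg(p_values, alpha=0.05)
print(selected)  # [0, 2, 5]
```

The selected indices play the role of the estimated set of relevant variables. Because the rule is step-up, a variable can be selected even when its own p-value exceeds its rank threshold, provided a larger p-value further down the ordering meets its threshold.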
Taxonomy
Topics: Statistical Methods and Inference · Advanced Statistical Process Monitoring · Statistical Distribution Estimation and Applications
