Stability Selection via Variable Decorrelation
Mahdi Nouraie, Connor Smith, Samuel Muller

TL;DR
This paper introduces a simple decorrelation step before applying Lasso to improve variable selection stability, effective in both high- and low-dimensional data, with empirical validation and a supporting R package.
Contribution
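The paper proposes a variable decorrelation step, applied before the Lasso, that improves the stability of variable selection. After decorrelation, the irrepresentable condition is satisfied under two assumptions. The approach applies to both high- and low-dimensional data and to different variable selection techniques.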
Findings
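Decorrelating variables before applying the Lasso improves selection stability, regardless of the direction of correlation among predictors.
The irrepresentable condition, which ensures consistency for the Lasso, is satisfied after decorrelation under two assumptions.
Empirical results show the approach is effective across variable selection techniques and in both high- and low-dimensional data.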
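For reference, the irrepresentable condition mentioned above can be written in its standard textbook form (due to Zhao and Yu). The notation here is an assumption of this note, not taken from the paper: C is the sample Gram matrix, S is the true support, and the blocks of C are indexed by S and its complement. The paper's two assumptions are not stated in this summary, so the last line only covers the idealized case where decorrelation makes C the identity.

```latex
% Sample Gram matrix, partitioned by the true support S and its complement
C = \frac{1}{n} X^\top X =
\begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{pmatrix}

% Strong irrepresentable condition: for some \eta > 0, elementwise,
\left| C_{21} \, C_{11}^{-1} \, \operatorname{sign}(\beta_S) \right| \le 1 - \eta

% Idealized decorrelated design: C = I, so C_{21} = 0 and the left-hand side is 0
\left| 0 \cdot I^{-1} \, \operatorname{sign}(\beta_S) \right| = 0 \le 1 - \eta
\quad \text{for any } \eta \in (0, 1]
```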
Abstract
The Lasso is a prominent algorithm for variable selection. However, its instability in the presence of correlated variables in the high-dimensional setting is well-documented. Although previous research has attempted to address this issue by modifying the Lasso loss function, this paper introduces an approach that simplifies the data processed by Lasso. We propose that decorrelating variables before applying the Lasso improves the stability of variable selection regardless of the direction of correlation among predictors. Furthermore, we highlight that the irrepresentable condition, which ensures consistency for the Lasso, is satisfied after variable decorrelation under two assumptions. In addition, by noting that the instability of the Lasso is not limited to high-dimensional settings, we demonstrate the effectiveness of the proposed approach for low-dimensional data. Finally, we…
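The idea of decorrelating before selecting can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the summary does not say which decorrelation transform the paper or its R package uses, so modified Gram-Schmidt stands in here, and it only works when there are more observations than variables and the columns are linearly independent. The function names (`decorrelate`, `lasso_orthonormal`) and the synthetic data are hypothetical. The Lasso step uses the known closed form for a design whose Gram matrix is the identity, namely soft-thresholding of the marginal regression coefficients.

```python
import math
import random

def center(col):
    mean = sum(col) / len(col)
    return [v - mean for v in col]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def correlation(a, b):
    a, b = center(a), center(b)
    return dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))

def decorrelate(columns):
    """Modified Gram-Schmidt on centered columns (stand-in transform).

    Returns columns with zero pairwise sample correlation, scaled so that
    X^T X / n is the identity. Assumes n > p and independent columns.
    """
    n = len(columns[0])
    out = []
    for col in columns:
        v = center(col)
        for q in out:
            coef = dot(v, q) / dot(q, q)
            v = [vi - coef * qi for vi, qi in zip(v, q)]
        norm = math.sqrt(dot(v, v))
        if norm < 1e-12:
            raise ValueError("columns are linearly dependent")
        out.append([vi * math.sqrt(n) / norm for vi in v])
    return out

def soft_threshold(z, lam):
    return math.copysign(max(abs(z) - lam, 0.0), z)

def lasso_orthonormal(columns, y, lam):
    """Lasso solution for (1/2n)||y - Xb||^2 + lam*||b||_1 when X^T X / n = I."""
    n = len(y)
    yc = center(y)
    return [soft_threshold(dot(col, yc) / n, lam) for col in columns]

# Synthetic data: x2 is positively and x3 negatively correlated with x1.
random.seed(0)
n = 200
z = [random.gauss(0, 1) for _ in range(n)]
x1 = [zi + 0.1 * random.gauss(0, 1) for zi in z]
x2 = [zi + 0.1 * random.gauss(0, 1) for zi in z]
x3 = [-zi + 0.1 * random.gauss(0, 1) for zi in z]
y = [2.0 * a + 0.1 * random.gauss(0, 1) for a in x1]

X = [x1, x2, x3]
D = decorrelate(X)
beta = lasso_orthonormal(D, y, lam=0.5)

print("correlations before:", correlation(x1, x2), correlation(x1, x3))
print("correlations after: ", correlation(D[0], D[1]), correlation(D[0], D[2]))
print("Lasso coefficients on decorrelated columns:", beta)
```

The coefficients refer to the decorrelated columns, not the original variables. With Gram-Schmidt the result also depends on column order. How the paper handles the mapping back to the original predictors, and the high-dimensional case, is not covered by this summary.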
Taxonomy
Topics: Advanced Control Systems Optimization
