
TL;DR
This paper introduces the Dual Lasso Selector, a new method for variable selection in high-dimensional linear regression with correlated variables, supported by theoretical analysis and empirical comparisons.
Contribution
It develops a novel variable selection procedure based on the dual Lasso solution, introduces the Pseudo Irrepresentable Condition (PIC), and combines the selector with Ridge estimation for improved prediction.
Findings
The Dual Lasso Selector effectively identifies correlated active variables.
The Pseudo Irrepresentable Condition is necessary and sufficient for consistent variable selection by the Dual Lasso Selector.
The combined DLSelect+Ridge method outperforms existing methods in accuracy and speed.
Abstract
We consider the problem of model selection and estimation in sparse high-dimensional linear regression models with strongly correlated variables. First, we study the theoretical properties of the dual Lasso solution, and we show that joint consideration of the Lasso primal and dual solutions is useful for selecting correlated active variables. Second, we argue that correlations among active predictors are not problematic, and we derive a new weaker condition on the design matrix, called the Pseudo Irrepresentable Condition (PIC). Third, we present a new variable selection procedure, the Dual Lasso Selector, and we prove that the PIC is a necessary and sufficient condition for consistent variable selection for the proposed method. Finally, combining the Dual Lasso Selector with Ridge estimation achieves even better prediction performance. We call the combination…
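The idea of reading the active set off the dual solution can be illustrated with a small sketch. It is not the paper's implementation; it relies only on the standard Lasso optimality (KKT) conditions, under which every column x_j satisfies |x_j^T r| <= lambda at the solution, where r is the Lasso residual, with equality for every nonzero coefficient. The objective scaling 0.5*||y - X b||^2 + lambda*||b||_1, the tolerance, the Ridge penalty value, the toy data, and all function names below are assumptions made for illustration.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator: sign(z) * max(|z| - t, 0)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def coordinate_descent(X, y, l1=0.0, l2=0.0, sweeps=2000):
    """Minimise 0.5*||y - X b||^2 + l1*||b||_1 + 0.5*l2*||b||^2 by cyclic
    coordinate descent. X is a list of columns (hypothetical helper)."""
    beta = [0.0] * len(X)
    resid = list(y)  # residual y - X beta, kept up to date
    for _ in range(sweeps):
        for j, col in enumerate(X):
            norm2 = sum(v * v for v in col)
            # correlation of column j with the residual that excludes beta_j
            rho = sum(c * r for c, r in zip(col, resid)) + norm2 * beta[j]
            new = soft_threshold(rho, l1) / (norm2 + l2)
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - delta * c for r, c in zip(resid, col)]
                beta[j] = new
    return beta, resid


def dual_lasso_select(X, resid, lam, tol=1e-6):
    """Keep every column whose correlation with the Lasso residual reaches
    the bound lam (assumed selection rule, up to a numerical tolerance)."""
    selected = []
    for j, col in enumerate(X):
        corr = abs(sum(c * r for c, r in zip(col, resid)))
        if corr >= lam - tol:
            selected.append(j)
    return selected


# Toy design: columns 0 and 1 are identical (perfectly correlated),
# column 2 is orthogonal to them and irrelevant to the response.
x1 = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
x2 = list(x1)
x3 = [1.0, 1.0, -1.0, -1.0, 0.0, 0.0]
noise = [0.1, 0.1, -0.1, -0.1, 0.2, -0.2]
y = [2.0 * a + e for a, e in zip(x1, noise)]
X = [x1, x2, x3]
lam = 1.0

# Step 1: Lasso fit (l2 = 0) and its support.
lasso_beta, resid = coordinate_descent(X, y, l1=lam)
lasso_support = [j for j, b in enumerate(lasso_beta) if abs(b) > 1e-6]

# Step 2: selection from the dual side via residual correlations.
selected = dual_lasso_select(X, resid, lam)

# Step 3: Ridge refit (l1 = 0) restricted to the selected columns.
ridge_beta, _ = coordinate_descent([X[j] for j in selected], y, l2=0.4)
```

On this toy design the Lasso keeps only the first of the two duplicated columns, while the residual-correlation check flags both of them and leaves out the irrelevant third column. The Ridge refit then spreads the weight evenly across the correlated pair.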
Taxonomy
Topics: Statistical Methods and Inference · Statistical Methods and Bayesian Inference · Advanced Causal Inference Techniques
