Efficient Clustering of Correlated Variables and Variable Selection in High-Dimensional Linear Models
Niharika Gauraha, Swapan K. Parui

TL;DR
This paper proposes the Adaptive Cluster Lasso (ACL), a three-stage method for variable selection in high-dimensional linear models with correlated variables, combining initial Lasso selection, clustering, and sparse estimation.
Contribution
The paper introduces ACL, a novel three-stage procedure that improves variable selection and clustering in high-dimensional correlated data, with proven consistency and efficiency.
Findings
ACL is consistent in identifying true group structures.
ACL outperforms traditional methods in simulated datasets.
The method demonstrates effective group selection in pseudo-real data.
Abstract
In this paper, we introduce the Adaptive Cluster Lasso (ACL) method for variable selection in high-dimensional sparse regression models with strongly correlated variables. To handle correlated variables, the concept of clustering or grouping variables and then pursuing model fitting is widely accepted. When the dimension is very high, finding an appropriate group structure is as difficult as the original problem. The ACL is a three-stage procedure. At the first stage, we use the Lasso (or its adaptive or thresholded version) to do an initial selection, and we then also include those variables that are not selected by the Lasso but are strongly correlated with the variables it selected. At the second stage, we cluster the variables based on the reduced set of predictors, and at the third stage, we perform sparse estimation such as the Lasso on cluster representatives or the group Lasso…
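The three stages described in the abstract can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation, and several choices in it are assumptions made only for illustration: the plain Lasso is fitted by a simple coordinate descent, "strongly correlated" means an absolute sample correlation of at least a hypothetical threshold `rho_thr`, the clustering step takes connected components of the thresholded correlation graph, and each cluster is represented by the mean of its columns (which presumes positively correlated, centered variables). The function names and the toy data are likewise hypothetical.

```python
import random

def soft_threshold(z, t):
    """Soft-thresholding operator used by the Lasso coordinate update."""
    return z - t if z > t else z + t if z < -t else 0.0

def lasso_cd(cols, y, lam, n_iter=200):
    """Coordinate descent for (1/(2n))*||y - X b||^2 + lam*||b||_1.
    `cols` holds the columns of X; data are assumed centered (no intercept)."""
    n, p = len(y), len(cols)
    scale = [sum(v * v for v in c) / n for c in cols]
    beta, resid = [0.0] * p, list(y)
    for _ in range(n_iter):
        for j, c in enumerate(cols):
            if scale[j] == 0.0:
                continue
            # Correlation of column j with the partial residual (column j added back).
            rho = sum(ci * ri for ci, ri in zip(c, resid)) / n + scale[j] * beta[j]
            new = soft_threshold(rho, lam) / scale[j]
            delta = new - beta[j]
            if delta != 0.0:
                resid = [ri - ci * delta for ci, ri in zip(c, resid)]
                beta[j] = new
    return beta

def corr(a, b):
    """Sample Pearson correlation of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    num = sum((x - ma) * (v - mb) for x, v in zip(a, b))
    den = (sum((x - ma) ** 2 for x in a) * sum((v - mb) ** 2 for v in b)) ** 0.5
    return num / den if den > 0 else 0.0

def adaptive_cluster_lasso(cols, y, lam=0.1, rho_thr=0.8):
    """Illustrative three-stage procedure; returns (cluster, coefficient) pairs."""
    p = len(cols)
    # Stage 1: Lasso pre-selection, then add variables strongly correlated
    # with any selected variable (threshold rho_thr is an assumption).
    beta = lasso_cd(cols, y, lam)
    picked = [j for j in range(p) if beta[j] != 0.0]
    reduced = sorted(set(picked) | {k for k in range(p) for j in picked
                                    if abs(corr(cols[k], cols[j])) >= rho_thr})
    # Stage 2: cluster the reduced set. Assumption: connected components of
    # the graph linking variables whose |correlation| >= rho_thr.
    clusters, seen = [], set()
    for start in reduced:
        if start in seen:
            continue
        comp, stack = [], [start]
        seen.add(start)
        while stack:
            u = stack.pop()
            comp.append(u)
            for v in reduced:
                if v not in seen and abs(corr(cols[u], cols[v])) >= rho_thr:
                    seen.add(v)
                    stack.append(v)
        clusters.append(sorted(comp))
    # Stage 3: Lasso on cluster representatives (here: column means).
    reps = [[sum(cols[j][i] for j in c) / len(c) for i in range(len(y))]
            for c in clusters]
    gamma = lasso_cd(reps, y, lam)
    return [(c, g) for c, g in zip(clusters, gamma) if g != 0.0]

# Toy data (hypothetical): two correlated groups {0,1,2} and {3,4}, five noise variables.
random.seed(0)
n = 100
z1 = [random.gauss(0, 1) for _ in range(n)]
z2 = [random.gauss(0, 1) for _ in range(n)]
cols = [[z + 0.1 * random.gauss(0, 1) for z in z1] for _ in range(3)]
cols += [[z + 0.1 * random.gauss(0, 1) for z in z2] for _ in range(2)]
cols += [[random.gauss(0, 1) for _ in range(n)] for _ in range(5)]
y = [2 * (cols[0][i] + cols[1][i] + cols[2][i])
     + 1.5 * (cols[3][i] + cols[4][i])
     + 0.5 * random.gauss(0, 1) for i in range(n)]

selected = adaptive_cluster_lasso(cols, y)
for cluster, coef in selected:
    print(cluster, round(coef, 2))
```

In this sketch, a variable that the first-stage Lasso drops can still re-enter through its correlation with a selected variable, which is the mechanism that lets whole correlated groups reach the clustering stage. The group Lasso alternative mentioned for the third stage is not shown.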
Taxonomy
Topics: Statistical Methods and Inference · Bayesian Methods and Mixture Models · Statistical Methods and Bayesian Inference
