Pre-processing with Orthogonal Decompositions for High-dimensional Explanatory Variables
Xu Han, Ethan X Fang, Cheng Yong Tang

TL;DR
This paper introduces PROD, a pre-processing method using orthogonal decompositions to improve high-dimensional regression, especially when correlated variables violate key assumptions of LASSO.
Contribution
It proposes a novel orthogonal decomposition-based pre-processing technique that enhances LASSO performance in high-dimensional settings with correlated variables.
Findings
PROD improves variable selection accuracy in simulations.
PROD enhances predictive performance in real data analysis.
Theoretical analysis confirms benefits for high-dimensional penalized regression.
Abstract
Strong correlations between explanatory variables are problematic for high-dimensional regularized regression methods. Due to the violation of the Irrepresentable Condition, the popular LASSO method may suffer from false inclusions of inactive variables. In this paper, we propose pre-processing with orthogonal decompositions (PROD) for the explanatory variables in high-dimensional regressions. The PROD procedure is constructed based upon a generic orthogonal decomposition of the design matrix. We demonstrate by two concrete cases that the PROD approach can be effectively constructed for improving the performance of high-dimensional penalized regression. Our theoretical analysis reveals their properties and benefits for high-dimensional penalized linear regression with LASSO. Extensive numerical studies with simulations and data analysis show the promising performance of the PROD.
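As a rough illustration of the idea in the abstract, the sketch below splits a design matrix X into two orthogonal pieces, X = PX + (I - P)X, where P is a projection matrix. The choice of P here (projection onto the top-k left singular vectors of X) is an assumption made for illustration only, since this summary does not say which two concrete decompositions the paper uses. The function names, the simulated factor-model data, and the value of k are likewise hypothetical. The sketch only shows that the residual part can have much weaker column correlations than X itself, which is the kind of design a LASSO fit tends to handle better.

```python
import numpy as np


def prod_decompose(X, k):
    """Split X into P X and (I - P) X.

    Assumption for illustration: P projects onto the span of the
    top-k left singular vectors of X.
    """
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    U_k = U[:, :k]                 # n x k orthonormal basis
    low_rank = U_k @ (U_k.T @ X)   # P X
    residual = X - low_rank        # (I - P) X
    return low_rank, residual


def mean_abs_offdiag_corr(X):
    """Average absolute pairwise correlation between columns of X."""
    C = np.corrcoef(X, rowvar=False)
    p = C.shape[0]
    off_diag_sum = np.abs(C).sum() - np.abs(np.diag(C)).sum()
    return off_diag_sum / (p * (p - 1))


# Simulated design with strongly correlated columns (hypothetical data):
# two latent factors drive all p explanatory variables.
rng = np.random.default_rng(0)
n, p, k = 100, 200, 2
factors = rng.standard_normal((n, k))
loadings = rng.standard_normal((k, p))
X = factors @ loadings + rng.standard_normal((n, p))
X = X - X.mean(axis=0)             # center the columns

low_rank, residual = prod_decompose(X, k)

corr_before = mean_abs_offdiag_corr(X)
corr_after = mean_abs_offdiag_corr(residual)
print("mean |corr| in X:        ", round(corr_before, 3))
print("mean |corr| in (I - P) X:", round(corr_after, 3))
```

In a pre-processing workflow of this kind, the residual matrix would then be passed to a penalized regression routine in place of the original X. How the response and the projected component are handled at that stage is specified in the paper and is not reproduced here.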
Taxonomy
Topics: Statistical Methods and Inference · Sparse and Compressive Sensing Techniques · Optimal Experimental Design Methods
Methods: Linear Regression
