New developments in Sparse PLS regression
Jérémy Magnanensi, Myriam Maumy-Bertrand, Nicolas Meyer, Frédéric Bertrand

TL;DR
This paper introduces a bootstrap-based variable selection method for PLS regression that improves stability and accuracy in high-dimensional genomic data analysis by eliminating the need for cross-validation.
Contribution
The authors develop a new bootstrap-based approach for predictor significance testing and adapt sparse PLS methods, enhancing variable selection stability and predictive performance.
Findings
Bootstrap method outperforms CV in variable significance testing.
Enhanced stability and accuracy in gene expression classification.
Better separation of noise from relevant signals in high-dimensional data.
Abstract
Methods based on partial least squares (PLS) regression, which has recently gained much attention in the analysis of high-dimensional genomic datasets, have been developed since the early 2000s for performing variable selection. Most of these techniques rely on tuning parameters that are often determined by cross-validation (CV) based methods, which raises important stability issues. To overcome this, we have developed a new dynamic bootstrap-based method for significant predictor selection, suitable for both PLS regression and its incorporation into generalized linear models (GPLS). It relies on the establishment of bootstrap confidence intervals, which allows testing of the significance of predictors at a preset type I risk α, and avoids the use of CV. We have also developed adapted versions of sparse PLS (SPLS) and sparse GPLS regression (SGPLS), using a recently introduced…
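The bootstrap idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' dynamic algorithm or their implementation: it assumes a one-component PLS1 fit, plain percentile intervals, and synthetic toy data, and the names `pls1_one_component` and `bootstrap_select` are hypothetical. A predictor is kept when its bootstrap confidence interval at level 1 - alpha excludes zero, so no cross-validation is involved.

```python
import random

def center(values):
    """Subtract the mean from a list of numbers."""
    m = sum(values) / len(values)
    return [v - m for v in values]

def pls1_one_component(X, y):
    """Regression coefficients of a one-component PLS1 fit (centered data)."""
    n, p = len(X), len(X[0])
    cols = [center([row[j] for row in X]) for j in range(p)]
    yc = center(y)
    # Weight vector w is proportional to X^T y.
    w = [sum(c[i] * yc[i] for i in range(n)) for c in cols]
    norm = sum(v * v for v in w) ** 0.5
    if norm == 0.0:
        return [0.0] * p
    w = [v / norm for v in w]
    # Score t = X w, then regress y on t to get the loading q.
    t = [sum(cols[j][i] * w[j] for j in range(p)) for i in range(n)]
    tt = sum(v * v for v in t)
    if tt == 0.0:
        return [0.0] * p
    q = sum(t[i] * yc[i] for i in range(n)) / tt
    return [q * wj for wj in w]

def bootstrap_select(X, y, n_boot=500, alpha=0.05, seed=0):
    """Keep predictors whose percentile bootstrap interval excludes zero."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    draws = [[] for _ in range(p)]
    for _ in range(n_boot):
        # Resample observations (rows) with replacement and refit.
        idx = [rng.randrange(n) for _ in range(n)]
        coefs = pls1_one_component([X[i] for i in idx], [y[i] for i in idx])
        for j in range(p):
            draws[j].append(coefs[j])
    lo_k = int((alpha / 2) * n_boot)   # index of the lower percentile
    hi_k = n_boot - 1 - lo_k           # symmetric upper percentile
    selected, intervals = [], []
    for j in range(p):
        s = sorted(draws[j])
        lo, hi = s[lo_k], s[hi_k]
        intervals.append((lo, hi))
        if lo > 0 or hi < 0:
            selected.append(j)
    return selected, intervals

# Toy data (an assumption for illustration): predictors 0 and 1 drive y,
# predictors 2 to 5 are pure noise.
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 1) for _ in range(6)] for _ in range(80)]
y = [3 * row[0] - 3 * row[1] + 0.5 * data_rng.gauss(0, 1) for row in X]

selected, intervals = bootstrap_select(X, y)
print("selected predictors:", selected)
for j, (lo, hi) in enumerate(intervals):
    print(f"x{j}: [{lo:.3f}, {hi:.3f}]")
```

In this sketch `alpha` plays the role of the preset type I risk for each individual interval. Noise predictors can still be selected occasionally, which is expected for a per-predictor test at that risk level.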
