A non-asymptotic upper bound in prediction for the PLS estimator

Luca Castelli (ICJ; PSPM); Irène Gannaz (G-SCOP_GROG; G-SCOP); Clément Marteau (ICJ; PSPM)

arXiv:2410.10237·math.ST·October 15, 2024

Open Access

TL;DR

This paper derives non-asymptotic upper bounds on the prediction risk of the PLS estimator in high-dimensional linear models, identifies scenarios in which the variability of the estimator may explode, and shows that a Ridge regularization step remedies them.

Contribution

It provides non-asymptotic prediction risk bounds for PLS in high dimensions and introduces a Ridge regularization step to mitigate the variability issues.

Findings

01. The risk bounds depend on the number of observations, the noise level, the properties of the design matrix, and the number of PLS components.

02. The variability of the PLS estimator may explode in some scenarios.

03. A Ridge regularization step can stabilize PLS in these scenarios.

Abstract

We investigate the theoretical performance of the Partial Least Squares (PLS) algorithm in a high-dimensional context. We provide upper bounds on the prediction risk of the PLS estimator in the statistical linear model. Our bounds are non-asymptotic and are expressed in terms of the number of observations, the noise level, the properties of the design matrix, and the number of PLS components considered. In particular, we exhibit some scenarios where the variability of the PLS estimator may explode, and we prove that these situations can be circumvented by introducing a Ridge regularization step. These theoretical findings are illustrated by numerical simulations.
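
To make the objects in the abstract concrete, here is a minimal sketch of a single-response PLS estimator and of an empirical prediction loss. It relies on the standard characterisation of PLS with k components as least squares restricted to the Krylov subspace spanned by X'y, (X'X)X'y, ..., (X'X)^(k-1)X'y. The abstract does not state this characterisation or the paper's notation, so the function names `pls_krylov` and `prediction_loss`, the `alpha` parameter, and the choice of applying the ridge penalty to the projected problem are all assumptions made for illustration. They are not necessarily the construction analysed by the authors.

```python
import numpy as np


def pls_krylov(X, y, k, alpha=0.0):
    """Single-response PLS coefficients with k components (illustrative sketch).

    Uses the standard characterisation of PLS as least squares restricted to
    the Krylov subspace span{X'y, (X'X)X'y, ..., (X'X)^(k-1) X'y}.
    `alpha` is a hypothetical ridge penalty on the projected problem; it is an
    assumption of this sketch, not necessarily the paper's regularization step.
    """
    G = X.T @ X
    v = X.T @ y
    basis = []
    for _ in range(k):
        # Modified Gram-Schmidt: remove components along earlier directions.
        w = v.copy()
        for b in basis:
            w -= (b @ w) * b
        norm = np.linalg.norm(w)
        if norm < 1e-12:  # the Krylov subspace stopped growing
            break
        w /= norm
        basis.append(w)
        v = G @ w  # next Krylov direction
    W = np.column_stack(basis)  # orthonormal basis of the kept directions
    Z = X @ W
    # (Ridge-penalised) least squares in the reduced coordinates.
    coef = np.linalg.solve(Z.T @ Z + alpha * np.eye(W.shape[1]), Z.T @ y)
    return W @ coef


def prediction_loss(X, beta_hat, beta_true):
    """Empirical prediction loss (1/n) * ||X (beta_hat - beta_true)||^2."""
    return np.sum((X @ (beta_hat - beta_true)) ** 2) / X.shape[0]
```

With `alpha=0.0` the function returns the unpenalised PLS fit, which coincides with ordinary least squares once k reaches the rank of a full-column-rank design. A positive `alpha` shrinks the coefficients in the reduced coordinates, which is one simple way to damp the instability that arises when the projected Gram matrix `Z.T @ Z` is close to singular.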

Taxonomy

Topics: Fault Detection and Control Systems · Spectroscopy and Chemometric Analyses · Blind Source Separation Techniques