On the prediction loss of the lasso in the partially labeled setting
Pierre C. Bellec, Arnak S. Dalalyan, Edwin Grappin, Quentin Paris

TL;DR
This paper derives prediction risk bounds for the lasso estimator in semi-supervised and transductive regression with partially labeled data, giving new oracle inequalities and showing how unlabeled features affect the risk.
Contribution
The paper proposes adaptations of the lasso to the partially labeled setting and derives non-asymptotic risk bounds for high-dimensional regression, emphasizing the role of unlabeled features.
Findings
Unlabeled features can improve prediction risk bounds when the design matrix has small restricted eigenvalues.
New oracle inequalities are established for the adapted lasso in semi-supervised settings.
The bounds make explicit the interplay between the bias due to mis-specification of the linear model, the bias due to approximate sparsity, and the variance.
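To make the role of unlabeled features concrete, here is a minimal sketch of one way a lasso-type estimator can use them. It is an illustration under stated assumptions, not necessarily the estimator defined in the paper: the quadratic term of the least-squares criterion is computed from the Gram matrix of all features (labeled and unlabeled), while the linear term uses only the labeled pairs. The function names `soft_threshold` and `semi_supervised_lasso`, the objective, and the proximal-gradient (ISTA) solver are all choices made for this example.

```python
import numpy as np


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (componentwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def semi_supervised_lasso(X_lab, y, X_unlab, lam, n_iter=500):
    """Hypothetical sketch: minimize over beta

        0.5 * beta' G beta - c' beta + lam * ||beta||_1,

    where G is the Gram matrix of ALL features and c uses labeled data only.
    This objective is an assumption made for illustration.
    """
    n = X_lab.shape[0]
    X_all = np.vstack([X_lab, X_unlab])
    G = X_all.T @ X_all / X_all.shape[0]   # Gram matrix from all features
    c = X_lab.T @ y / n                    # correlation with labels

    # Step size 1/L, with L the largest eigenvalue of G (Lipschitz constant
    # of the gradient of the smooth part).
    L = np.linalg.eigvalsh(G)[-1]
    step = 1.0 / L if L > 0 else 1.0

    beta = np.zeros(X_lab.shape[1])
    for _ in range(n_iter):
        grad = G @ beta - c
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Bounded covariates and few labels, many unlabeled features.
    X_lab = rng.uniform(-1, 1, size=(30, 50))
    X_unlab = rng.uniform(-1, 1, size=(1000, 50))
    y = X_lab[:, 0] - X_lab[:, 1] + 0.1 * rng.standard_normal(30)
    beta_hat = semi_supervised_lasso(X_lab, y, X_unlab, lam=0.05)
    print(np.nonzero(beta_hat)[0])
```

With only 30 labeled rows and 50 covariates, the Gram matrix of the labeled design alone is singular; pooling the unlabeled rows gives a better-conditioned quadratic term, which is the intuition behind the finding about small restricted eigenvalues. The tuning value `lam=0.05` is arbitrary here.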
Abstract
In this paper we revisit the risk bounds of the lasso estimator in the context of transductive and semi-supervised learning. In other terms, the setting under consideration is that of regression with random design under partial labeling. The main goal is to obtain user-friendly bounds on the off-sample prediction risk. To this end, the simple setting of bounded response variable and bounded (high-dimensional) covariates is considered. We propose some new adaptations of the lasso to these settings and establish oracle inequalities both in expectation and in deviation. These results provide non-asymptotic upper bounds on the risk that highlight the interplay between the bias due to the mis-specification of the linear model, the bias due to the approximate sparsity and the variance. They also demonstrate that the presence of a large number of unlabeled features may have significant…
