Estimating Generalization Performance Along the Trajectory of Proximal SGD in Robust Regression

Kai Tan; Pierre C. Bellec

arXiv:2410.02629 · math.ST · November 5, 2024

TL;DR

This paper develops estimators to accurately track the generalization error of gradient-based algorithms in high-dimensional robust regression, enabling optimal stopping and improved understanding of model performance.

Contribution

It introduces consistent estimators for the generalization error along the trajectory of GD, SGD, and proximal variants in high-dimensional robust regression with heavy-tailed errors.

Findings

01

Estimators accurately predict generalization error in various robust regression models.

02

Proposed risk estimates effectively serve as proxies for actual generalization error.

03

Simulations confirm the estimators' effectiveness in practical scenarios.

Abstract

This paper studies the generalization performance of iterates obtained by Gradient Descent (GD), Stochastic Gradient Descent (SGD) and their proximal variants in high-dimensional robust regression problems. The number of features is comparable to the sample size and errors may be heavy-tailed. We introduce estimators that precisely track the generalization error of the iterates along the trajectory of the iterative algorithm. These estimators are provably consistent under suitable conditions. The results are illustrated through several examples, including Huber regression, pseudo-Huber regression, and their penalized variants with non-smooth regularizer. We provide explicit generalization error estimates for iterates generated from GD and SGD, or from proximal SGD in the presence of a non-smooth regularizer. The proposed risk estimates serve as effective proxies for the actual…
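
To make the setting in the abstract concrete, here is a minimal sketch of proximal SGD for L1-penalized Huber regression on simulated data with heavy-tailed noise. It is an illustration under stated assumptions, not the authors' code or their estimator: the function names, step size, batch size, penalty level, and simulation design are all hypothetical choices. The error curve it records is an oracle quantity that uses the true coefficient vector `b_star`, which is unavailable in practice; the paper's estimators are meant to track this kind of curve without that knowledge.

```python
import numpy as np

def huber_grad(r, delta=1.0):
    """Derivative of the Huber loss with respect to the residual r."""
    return np.clip(r, -delta, delta)

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (non-smooth L1 regularizer)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def proximal_sgd_huber(X, y, lam=0.05, step=0.05, batch=50,
                       n_iter=300, delta=1.0, seed=0):
    """Run proximal SGD and return every iterate (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    b = np.zeros(p)
    path = [b.copy()]
    for _ in range(n_iter):
        idx = rng.choice(n, size=batch, replace=False)  # mini-batch
        resid = y[idx] - X[idx] @ b
        # Gradient of the averaged Huber loss on the mini-batch.
        grad = -X[idx].T @ huber_grad(resid, delta) / batch
        # Gradient step on the smooth part, then prox of the L1 penalty.
        b = soft_threshold(b - step * grad, step * lam)
        path.append(b.copy())
    return np.array(path)

# Assumed simulation design: n and p comparable, heavy-tailed errors.
rng = np.random.default_rng(1)
n, p = 400, 200
X = rng.standard_normal((n, p))          # identity-covariance features
b_star = np.zeros(p)
b_star[:10] = 1.0                        # sparse true signal
y = X @ b_star + rng.standard_t(2, size=n)  # t-distributed noise, 2 d.o.f.

path = proximal_sgd_huber(X, y)
# Oracle error ||b_t - b_star||^2 at every iterate; with identity
# covariance this equals the excess squared prediction error on a
# fresh observation.
errors = np.sum((path - b_star) ** 2, axis=1)
best_t = int(np.argmin(errors))
print(f"oracle error is smallest at iteration {best_t}")
```

With a data-driven estimate of the same curve in place of `errors`, the `argmin` step is how one would pick a stopping iteration without access to `b_star`.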

Code & Models

Repositories

kaitan365/sgd-generlization-errors
PyTorch · Official

Videos

Estimating Generalization Performance Along the Trajectory of Proximal SGD in Robust Regression · SlidesLive

Taxonomy

Topics: Face and Expression Recognition · Advanced Statistical Methods and Models · Grey System Theory Applications

Methods: Stochastic Gradient Descent