On concentration for (regularized) empirical risk minimization

Sara van de Geer; Martin Wainwright

arXiv:1512.00677·math.ST·January 12, 2016


TL;DR

This paper studies the concentration behavior of empirical risk minimizers. It shows that, after normalization, the risk of the empirical minimizer concentrates on a single point, and it extends earlier results for constrained estimators to regularized least squares and to more general loss functions.

Contribution

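The paper generalizes and sharpens an existing concentration result for constrained estimators in the normal sequence model. For regularized least squares with convex penalties it uses a "direct" argument based on Borell's theorem; for other loss functions it uses more "indirect" arguments together with concentration inequalities for maxima of empirical processes.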

Findings

01

The risk of the empirical minimizer concentrates on a single point after normalization.

02

Results extend to regularized least squares with convex penalties.

03

Results generalize to other loss functions, including the negative log-likelihood for exponential families combined with a strictly convex regularization penalty.

Abstract

Rates of convergence for empirical risk minimizers have been well studied in the literature. In this paper, we aim to provide a complementary set of results, in particular by showing that after normalization, the risk of the empirical minimizer concentrates on a single point. Such results have been established by Chatterjee (2014) for constrained estimators in the normal sequence model. We first generalize and sharpen this result to regularized least squares with convex penalties, making use of a "direct" argument based on Borell's theorem. We then study generalizations to other loss functions, including the negative log-likelihood for exponential families combined with a strictly convex regularization penalty. The results in this general setting are based on more "indirect" arguments as well as on concentration inequalities for maxima of empirical processes.
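
The concentration phenomenon described in the abstract can be illustrated numerically. The sketch below is not taken from the paper; it is a minimal simulation under assumptions chosen here for illustration: a normal sequence model with unit-variance noise, an l1 penalty (so the regularized least squares estimator is coordinate-wise soft-thresholding), a sparse truth with entries equal to 3, and the ratio of standard deviation to mean of the estimation error as a stand-in for "normalization". Because soft-thresholding is 1-Lipschitz in the data, the Euclidean error is a 1-Lipschitz function of the Gaussian noise, so its fluctuations should stay bounded while its mean grows with the dimension. The function names `soft_threshold`, `simulate_errors`, and `relative_spread` are hypothetical helpers.

```python
import numpy as np


def soft_threshold(y, lam):
    """Minimizer of 0.5 * ||y - theta||^2 + lam * ||theta||_1 (coordinate-wise)."""
    return np.sign(y) * np.maximum(np.abs(y) - lam, 0.0)


def simulate_errors(n, lam=1.0, sparsity=0.1, reps=200, seed=0):
    """Euclidean estimation errors ||theta_hat - theta0||_2 over `reps` noise draws.

    Assumed setup (illustrative only): y = theta0 + noise with noise ~ N(0, I_n),
    and theta0 has a fraction `sparsity` of entries equal to 3, the rest zero.
    """
    rng = np.random.default_rng(seed)
    theta0 = np.zeros(n)
    k = max(1, int(sparsity * n))
    theta0[:k] = 3.0
    noise = rng.standard_normal((reps, n))
    theta_hat = soft_threshold(theta0 + noise, lam)
    return np.linalg.norm(theta_hat - theta0, axis=1)


def relative_spread(n, **kwargs):
    """Standard deviation of the error divided by its mean, for dimension n."""
    errs = simulate_errors(n, **kwargs)
    return errs.std() / errs.mean()


if __name__ == "__main__":
    for n in (100, 1000, 10000):
        errs = simulate_errors(n)
        print(f"n={n:6d}  mean={errs.mean():8.3f}  std={errs.std():6.3f}  "
              f"std/mean={errs.std() / errs.mean():.4f}")
```

In this setup the standard deviation of the error should remain of order one as `n` grows while the mean grows roughly like the square root of `n`, so the normalized error clusters ever more tightly around a single value. The specific penalty, signal, and normalization are choices made for this sketch and may differ from those analyzed in the paper.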
