High probability generalization bounds for uniformly stable algorithms with nearly optimal rate

Vitaly Feldman; Jan Vondrak

arXiv:1902.10710 · cs.LG · June 25, 2019 · 40 cites

Open Access

TL;DR

This paper establishes nearly optimal high-probability generalization bounds for uniformly stable algorithms, including stochastic gradient descent and regularized ERM, improving upon previous bounds and resolving open problems.

Contribution

The authors prove a nearly tight high-probability generalization bound for uniformly stable algorithms, applicable to multi-pass SGD and convex ERM, with a novel proof technique.

Findings

01

Bound of O(γ log(n) log(n/δ) + √(log(1/δ)/n)) on estimation error

02

First high-probability bounds for multi-pass SGD in convex settings

03

Nearly optimal rate: the bound matches the sampling error up to logarithmic factors when γ = O(1/√n)
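To make the first and third findings concrete, the sketch below evaluates the stated bound with every hidden constant in the O(·) set to 1, which is an assumption made purely for illustration. The function names are hypothetical and do not come from the paper. It compares the full bound against the sampling-error term √(log(1/δ)/n) at γ = 1/√n.

```python
import math


def sampling_error(n, delta):
    """Sampling-error term sqrt(log(1/delta) / n); constant assumed to be 1."""
    return math.sqrt(math.log(1.0 / delta) / n)


def stability_term(gamma, n, delta):
    """Stability term gamma * log(n) * log(n/delta); constant assumed to be 1."""
    return gamma * math.log(n) * math.log(n / delta)


def estimation_error_bound(gamma, n, delta):
    """Sum of the two terms in the stated O(.) bound, with constants dropped."""
    return stability_term(gamma, n, delta) + sampling_error(n, delta)


def overhead_vs_sampling(n, delta):
    """Ratio of the full bound to the sampling-error term at gamma = 1/sqrt(n)."""
    gamma = 1.0 / math.sqrt(n)
    return estimation_error_bound(gamma, n, delta) / sampling_error(n, delta)


if __name__ == "__main__":
    delta = 0.01
    for n in (10**4, 10**6, 10**8):
        gamma = 1.0 / math.sqrt(n)
        bound = estimation_error_bound(gamma, n, delta)
        ratio = overhead_vs_sampling(n, delta)
        print(f"n={n:>10}  bound={bound:.4f}  bound/sampling={ratio:.1f}")
```

At γ = 1/√n the ratio simplifies to 1 + log(n) log(n/δ) / √(log(1/δ)), so the gap to the sampling error grows only polylogarithmically in n while the bound itself still shrinks.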

Abstract

Algorithmic stability is a classical approach to understanding and analysis of the generalization error of learning algorithms. A notable weakness of most stability-based generalization bounds is that they hold only in expectation. Generalization with high probability has been established in a landmark paper of Bousquet and Elisseeff (2002) albeit at the expense of an additional $\sqrt{n}$ factor in the bound. Specifically, their bound on the estimation error of any $γ$-uniformly stable learning algorithm on $n$ samples and range in $[0, 1]$ is $O(γ \sqrt{n \log(1/δ)} + \sqrt{\log(1/δ)/n})$ with probability $\geq 1 - δ$. The $\sqrt{n}$ overhead makes the bound vacuous in the common settings where $γ \geq 1/\sqrt{n}$. A stronger bound was recently proved by the authors (Feldman and Vondrak, 2018) that reduces the overhead to at most $O(n^{1/4})$. Still, both of…
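A short worked comparison may help show why the overhead matters. It takes the Bousquet and Elisseeff bound in its standard form $O(γ \sqrt{n \log(1/δ)} + \sqrt{\log(1/δ)/n})$ and the new bound as listed under Findings, and substitutes $γ = 1/\sqrt{n}$ into the stability term of each. Constants hidden by the $O(\cdot)$ are suppressed, which is an assumption of this sketch.

```latex
\begin{align*}
\text{Bousquet--Elisseeff:}\quad
  \gamma \sqrt{n \log(1/\delta)}\,\Big|_{\gamma = 1/\sqrt{n}}
  &= \sqrt{\log(1/\delta)}
  && \text{(does not shrink as } n \text{ grows)} \\
\text{This paper:}\quad
  \gamma \log(n) \log(n/\delta)\,\Big|_{\gamma = 1/\sqrt{n}}
  &= \frac{\log(n)\,\log(n/\delta)}{\sqrt{n}}
  && \text{(tends to } 0 \text{ as } n \to \infty\text{)}
\end{align*}
```

Since the loss takes values in $[0, 1]$, a bound of order $\sqrt{\log(1/δ)}$ is at least 1 whenever $δ \leq 1/e$ and therefore says nothing, whereas the second expression stays within logarithmic factors of the sampling-error term $\sqrt{\log(1/δ)/n}$.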

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Privacy-Preserving Technologies in Data · Sparse and Compressive Sensing Techniques