Stability and Deviation Optimal Risk Bounds with Convergence Rate $O(1/n)$

Yegor Klochkov; Nikita Zhivotovskiy

arXiv:2103.12024·cs.LG·November 19, 2021

TL;DR

This paper shows that, under the Bernstein condition, high probability excess risk bounds for uniformly stable algorithms can reach order up to O(1/n), resolving a question posed by Shalev-Shwartz, Shamir, Srebro, and Sridharan (2009) in stochastic convex optimization.

Contribution

It demonstrates that the O(1/√n) sampling error can be eliminated under the Bernstein condition, achieving near-optimal risk bounds for empirical risk minimization and gradient descent.

Findings

01

High probability excess risk bounds of O(log n/n) are achievable.

02

The results apply to any empirical risk minimization method under the Bernstein condition.

03

O(1/n) bounds are obtained without smoothness assumptions for gradient descent.
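To get a feel for how much is gained by moving from the O(1/√n) sampling error to the O(log n/n) rate in the findings above, the two rates can be tabulated for a few sample sizes. This is a minimal sketch: all constants are dropped, and the function names `slow_rate` and `fast_rate` are illustrative choices, not notation from the paper.

```python
import math


def slow_rate(n: int) -> float:
    """Rate 1/sqrt(n), with all constants omitted (illustrative only)."""
    return 1.0 / math.sqrt(n)


def fast_rate(n: int) -> float:
    """Rate log(n)/n, with all constants omitted (illustrative only)."""
    return math.log(n) / n


# Print both rates and their ratio for a few sample sizes.
for n in (100, 10_000, 1_000_000):
    slow, fast = slow_rate(n), fast_rate(n)
    print(f"n={n:>9,}  1/sqrt(n)={slow:.2e}  log(n)/n={fast:.2e}  ratio={slow / fast:.1f}")
```

The ratio grows like √n / log n, so the gap between the two rates widens as the sample size increases.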

Abstract

The sharpest known high probability generalization bounds for uniformly stable algorithms (Feldman, Vondrák, 2018, 2019), (Bousquet, Klochkov, Zhivotovskiy, 2020) contain a generally inevitable sampling error term of order $\Theta(1/\sqrt{n})$. When applied to excess risk bounds, this leads to suboptimal results in several standard stochastic convex optimization problems. We show that if the so-called Bernstein condition is satisfied, the term $\Theta(1/\sqrt{n})$ can be avoided, and high probability excess risk bounds of order up to $O(1/n)$ are possible via uniform stability. Using this result, we show a high probability excess risk bound with the rate $O(\log n / n)$ for strongly convex and Lipschitz losses valid for any empirical risk minimization method. This resolves a question of Shalev-Shwartz, Shamir, Srebro, and Sridharan (2009). We discuss how $O(\log n / n)$ high…
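
For readers new to the two notions the abstract relies on, the following states them in a commonly used form. The notation ($A$ for the learning algorithm, $S$ and $S^{i}$ for samples differing in the $i$-th example, $\ell$ for the loss, $R(w) = \mathbb{E}\,\ell(w, z)$ for the risk, $w^\star$ for a risk minimizer, and the constants $\gamma$ and $B$) is assumed here for illustration; the paper's own statements may differ in symbols or constants.

$$
\sup_{z}\,\bigl|\ell(A(S), z) - \ell(A(S^{i}), z)\bigr| \le \gamma \quad \text{for all } S,\ S^{i} \quad \text{(uniform stability with parameter } \gamma\text{)}
$$

$$
\mathbb{E}\bigl(\ell(w, z) - \ell(w^\star, z)\bigr)^{2} \le B\,\bigl(R(w) - R(w^\star)\bigr) \quad \text{for all } w \quad \text{(Bernstein condition with parameter } B\text{)}
$$

Informally, the second inequality says that the variance of the excess loss is controlled by the excess risk itself, so models close to optimal in risk also have small loss fluctuations relative to $w^\star$.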

Videos

Stability and Deviation Optimal Risk Bounds with Convergence Rate $O(1/n)$ · SlidesLive