Generalization Error Bounds for Noisy, Iterative Algorithms

Ankit Pensia; Varun Jog; Po-Ling Loh

arXiv:1801.04295·cs.LG·January 16, 2018

TL;DR

This paper derives generalization error bounds for a wide class of noisy, iterative algorithms in machine learning, including SGLD and SGHMC, based on mutual information and Markovian structures.

Contribution

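The paper extends the mutual-information bound of Xu and Raginsky (2017) to noisy iterative algorithms with Markovian structure, covering different choices of output such as the last iterate or an average of iterates. This broadens the bound's applicability in statistical learning theory.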

Findings

01. Bounds apply to stochastic gradient Langevin dynamics (SGLD).

02. Error bounds hold for last iterate and averaged outputs.

03. Framework accommodates non-uniform data sampling.
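
To make the setting in these findings concrete, here is a minimal sketch of a noisy iterative update of the SGLD type. It is an illustration under stated assumptions, not the authors' implementation: the function names `sgld` and `clipped_sq_grad`, the clipped squared loss, the constant step sizes and noise scales, and the toy data are all hypothetical choices. The loop returns both the last iterate and the average of iterates, and accepts optional non-uniform sampling probabilities.

```python
import numpy as np


def clipped_sq_grad(w, z, clip=1.0):
    """Gradient of a squared loss on one example, clipped so its norm is bounded.

    Assumption: z packs the features followed by the label, z = (x, y).
    """
    x, y = z[:-1], z[-1]
    g = (w @ x - y) * x
    norm = np.linalg.norm(g)
    return g if norm <= clip else g * (clip / norm)


def sgld(grad_fn, data, w0, step_sizes, noise_scales, sample_probs=None, seed=0):
    """Toy noisy update: w_t = w_{t-1} - eta_t * grad(w_{t-1}, z_t) + sigma_t * xi_t."""
    rng = np.random.default_rng(seed)
    n = len(data)
    w = np.asarray(w0, dtype=float).copy()
    iterates = []
    for eta, sigma in zip(step_sizes, noise_scales):
        # Draw one training example; sample_probs=None means uniform sampling.
        idx = rng.choice(n, p=sample_probs)
        # Isotropic Gaussian noise injected at every step.
        xi = rng.standard_normal(w.shape)
        w = w - eta * grad_fn(w, data[idx]) + sigma * xi
        iterates.append(w.copy())
    iterates = np.stack(iterates)
    # Two example output functions: the last iterate and the average of iterates.
    return iterates[-1], iterates.mean(axis=0)


# Hypothetical toy regression data, used only to exercise the loop.
data_rng = np.random.default_rng(1)
X = data_rng.standard_normal((50, 3))
y = X @ np.array([1.0, -2.0, 0.5])
data = np.column_stack([X, y])

T = 200
last, avg = sgld(clipped_sq_grad, data, np.zeros(3), [0.05] * T, [0.01] * T)
```

Passing a probability vector as `sample_probs` switches the loop to non-uniform sampling. The gradient is clipped here only to mimic the bounded updates the abstract refers to.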

Abstract

In statistical learning theory, generalization error is used to quantify the degree to which a supervised machine learning algorithm may overfit to training data. Recent work [Xu and Raginsky (2017)] has established a bound on the generalization error of empirical risk minimization based on the mutual information $I(S; W)$ between the algorithm input $S$ and the algorithm output $W$, when the loss function is sub-Gaussian. We leverage these results to derive generalization error bounds for a broad class of iterative algorithms that are characterized by bounded, noisy updates with Markovian structure. Our bounds are very general and are applicable to numerous settings of interest, including stochastic gradient Langevin dynamics (SGLD) and variants of the stochastic gradient Hamiltonian Monte Carlo (SGHMC) algorithm. Furthermore, our error bounds hold for any output function computed over…
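
For reference, the Xu and Raginsky (2017) result that the abstract builds on is usually stated as below. The notation is assumed here rather than taken from the abstract: $S = (Z_1, \dots, Z_n)$ is a sample of $n$ i.i.d. points from a distribution $\mu$, the loss $\ell(w, Z)$ is $\sigma$-sub-Gaussian under $Z \sim \mu$ for every $w$, and $\mathrm{gen}(\mu, P_{W \mid S})$ denotes the expected gap between population risk and empirical risk.

$$
\left| \mathrm{gen}(\mu, P_{W \mid S}) \right| \le \sqrt{\frac{2\sigma^2}{n}\, I(S; W)}
$$

Under this bound, any control on $I(S; W)$ translates directly into a generalization guarantee, which is why bounding the mutual information of a noisy iterative algorithm is enough.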
