Fast-Rate Loss Bounds via Conditional Information Measures with Applications to Neural Networks

Fredrik Hellström; Giuseppe Durisi

arXiv:2010.11552·cs.LG·March 11, 2021

TL;DR

This paper introduces a new framework for deriving fast-rate bounds on test loss for randomized algorithms using conditional information measures, with applications to neural networks trained on MNIST datasets.

Contribution

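The paper develops bounds that depend on the conditional information density and decay as $1/n$ when that density is bounded uniformly in the training-set size $n$. This improves on earlier tail bounds with a $1/\sqrt{n}$ dependence. The paper also demonstrates the practical usefulness of the bounds on neural networks.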

Findings

01. Bounds decay as $1/n$ when the conditional information density is bounded uniformly in $n$.

02. Tail bounds provide nonvacuous estimates of neural network test loss.

03. The framework applies to both the PAC-Bayesian and the single-draw settings.

Abstract

We present a framework to derive bounds on the test loss of randomized learning algorithms for the case of bounded loss functions. Drawing from Steinke & Zakynthinou (2020), this framework leads to bounds that depend on the conditional information density between the output hypothesis and the choice of the training set, given a larger set of data samples from which the training set is formed. Furthermore, the bounds pertain to the average test loss as well as to its tail probability, both for the PAC-Bayesian and the single-draw settings. If the conditional information density is bounded uniformly in the size $n$ of the training set, our bounds decay as $1/n$. This is in contrast with the tail bounds involving conditional information measures available in the literature, which have a less benign $1/\sqrt{n}$ dependence. We demonstrate the usefulness of our tail bounds by showing…
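To give a feel for the gap between the two rates mentioned in the abstract, the following minimal sketch compares a schematic fast-rate term of the form $C/n$ with a schematic slow-rate term of the form $\sqrt{C/n}$. These are not the paper's bounds: the function names, the constant `c` (a stand-in for a uniformly bounded information term), and the sample sizes are all assumptions chosen purely for illustration.

```python
import math


def slow_rate(c: float, n: int) -> float:
    """Schematic slow-rate term sqrt(c / n); illustrative only, not the paper's bound."""
    return math.sqrt(c / n)


def fast_rate(c: float, n: int) -> float:
    """Schematic fast-rate term c / n; illustrative only, not the paper's bound."""
    return c / n


# Hypothetical constant standing in for an information term bounded uniformly in n.
c = 2.0

for n in (100, 1_000, 10_000, 100_000):
    slow = slow_rate(c, n)
    fast = fast_rate(c, n)
    # The ratio slow / fast equals sqrt(n / c), so the gap widens as n grows.
    print(f"n={n:>7d}  slow={slow:.5f}  fast={fast:.5f}  ratio={slow / fast:.1f}")
```

Under these assumptions the slow-rate term exceeds the fast-rate term by a factor of $\sqrt{n/C}$, which is why a $1/n$ dependence is described as more benign once $n$ is large relative to $C$.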
