Differentiable PAC-Bayes Objectives with Partially Aggregated Neural Networks

Felix Biggs; Benjamin Guedj

arXiv:2006.12228 · cs.LG · December 16, 2021


TL;DR

This paper introduces a new class of partially-aggregated estimators for stochastic neural networks, which give lower-variance gradient estimates, and derives a tighter, directly optimizable PAC-Bayesian bound. Together these make training easier and yield competitive guarantees.

Contribution

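It proposes partially-aggregated estimators with provably lower-variance gradient estimates for non-differentiable signed-output networks, and a differentiable PAC-Bayesian objective that needs no surrogate loss. The resulting generalization bound is twice as tight as that of Letarte et al. (2019) on a similar network type.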

Findings

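1. Averaging over an ensemble of stochastic neural networks yields a new class of partially-aggregated estimators.

2. For non-differentiable signed-output networks, these estimators give provably lower-variance gradient estimates, which empirically makes training easier.

3. The reformulated PAC-Bayesian bound is differentiable and directly optimizable without a surrogate loss, and is twice as tight as that of Letarte et al. (2019) on a similar network type.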

Abstract

We make three related contributions motivated by the challenge of training stochastic neural networks, particularly in a PAC-Bayesian setting: (1) we show how averaging over an ensemble of stochastic neural networks enables a new class of partially-aggregated estimators; (2) we show that these lead to provably lower-variance gradient estimates for non-differentiable signed-output networks; (3) we reformulate a PAC-Bayesian bound for these networks to derive a directly optimisable, differentiable objective and a generalisation guarantee, without using a surrogate loss or loosening the bound. This bound is twice as tight as that of Letarte et al. (2019) on a similar network type. We show empirically that these innovations make training easier and lead to competitive guarantees.
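
To make the idea of aggregation concrete, the following is a minimal sketch rather than the paper's implementation. It assumes a single signed-output unit whose weights are Gaussian with identity covariance, w ~ N(mu, I); this assumption is not stated in the abstract. Under it, the ensemble average E[sign(w . h)] has the closed form erf(mu . h / (sqrt(2) ||h||)), which is smooth in mu even though sign is not differentiable. The function names and the values of mu and h are hypothetical.

```python
import math
import numpy as np


def aggregated_sign_output(mu, h):
    """Closed-form E[sign(w . h)] for w ~ N(mu, I) (assumed weight distribution)."""
    # w . h is Gaussian with mean mu . h and standard deviation ||h||,
    # so the expected sign is erf(mean / (sqrt(2) * std)).
    mean = float(mu @ h)
    std = float(np.linalg.norm(h))
    return math.erf(mean / (math.sqrt(2.0) * std))


def sampled_sign_output(mu, h, n_samples, rng):
    """Monte Carlo estimate of the same expectation, by sampling weight vectors."""
    w = mu + rng.standard_normal((n_samples, mu.shape[0]))
    return float(np.mean(np.sign(w @ h)))


rng = np.random.default_rng(0)
mu = np.array([0.5, -1.0, 2.0])   # hypothetical mean weights of the final layer
h = np.array([1.0, -1.0, 1.0])    # hypothetical activations from earlier layers

exact = aggregated_sign_output(mu, h)
estimate = sampled_sign_output(mu, h, 200_000, rng)
print(f"closed form: {exact:.4f}, sampled: {estimate:.4f}")
```

In a partially aggregated scheme of this kind, only the final layer would be averaged in closed form while the activations h from earlier stochastic layers are still sampled, so the noise contributed by the last layer is removed from the estimate.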
