Differentiable PAC-Bayes Objectives with Partially Aggregated Neural Networks
Felix Biggs, Benjamin Guedj

TL;DR
This paper introduces a new class of partially-aggregated estimators for stochastic neural networks. These give provably lower-variance gradient estimates for signed-output networks and support a tighter, directly optimizable PAC-Bayesian bound, which makes training easier and yields competitive generalization guarantees.
Contribution
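The paper proposes partially-aggregated estimators obtained by averaging over an ensemble of stochastic neural networks, shows that they give provably lower-variance gradient estimates for non-differentiable signed-output networks, and reformulates a PAC-Bayesian bound for these networks into a differentiable objective that can be optimized directly, without a surrogate loss and without loosening the bound.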
Findings
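Averaging over an ensemble of stochastic neural networks gives rise to a new class of partially-aggregated estimators.
These estimators yield provably lower-variance gradient estimates for non-differentiable signed-output networks, which empirically makes training easier.
The reformulated PAC-Bayesian bound is twice as tight as that of Letarte et al. (2019) on a similar network type and leads to competitive guarantees.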
Abstract
We make three related contributions motivated by the challenge of training stochastic neural networks, particularly in a PAC-Bayesian setting: (1) we show how averaging over an ensemble of stochastic neural networks enables a new class of partially-aggregated estimators; (2) we show that these lead to provably lower-variance gradient estimates for non-differentiable signed-output networks; (3) we reformulate a PAC-Bayesian bound for these networks to derive a directly optimisable, differentiable objective and a generalisation guarantee, without using a surrogate loss or loosening the bound. This bound is twice as tight as that of Letarte et al. (2019) on a similar network type. We show empirically that these innovations make training easier and lead to competitive guarantees.
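To make the aggregation idea concrete, here is a minimal sketch of the simplest case we can state with confidence: a single signed-output unit sign(w · a) whose weights are drawn from an isotropic Gaussian N(mu, sigma^2 I). Its expectation over the weights has the closed form erf(mu · a / (sqrt(2) · sigma · ||a||)), which is smooth in mu even though each sampled output is not. The function names, the choice of NumPy, and the example values are assumptions made for illustration and are not taken from the paper; the paper's estimators, as the abstract describes them, apply this kind of averaging to only part of a deeper stochastic network.

```python
import math
import numpy as np


def aggregated_sign_output(mu, a, sigma=1.0):
    """Closed-form E[sign(w . a)] for w ~ N(mu, sigma^2 I) (hypothetical helper)."""
    norm = np.linalg.norm(a)
    return math.erf(float(mu @ a) / (math.sqrt(2.0) * sigma * norm))


def sampled_sign_output(mu, a, sigma=1.0, n_samples=200_000, seed=0):
    """Monte Carlo estimate of the same expectation (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Draw n_samples weight vectors around the mean mu.
    w = mu + sigma * rng.standard_normal((n_samples, mu.shape[0]))
    # Average the non-differentiable sign outputs.
    return float(np.mean(np.sign(w @ a)))


# Illustrative values only.
mu = np.array([0.5, 0.3, -0.2])
a = np.array([1.0, 2.0, -1.0])

exact = aggregated_sign_output(mu, a)
approx = sampled_sign_output(mu, a)
print(exact, approx)
```

The two numbers should agree up to Monte Carlo noise. The closed form involves no sampling at all, which gives some intuition for why replacing part of the sampling with an analytic average can reduce the variance of gradient estimates.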
