A PAC-Bayesian Tutorial with A Dropout Bound
David McAllester

TL;DR
This tutorial reviews PAC-Bayesian theory, focusing on three bounds: an Occam bound, a PAC-Bayesian bound for posterior distributions, and a training-variance bound, with applications to dropout and regularization.
Contribution
It provides a concise overview of PAC-Bayesian bounds, including a novel bound for dropout training and insights into variance reduction methods.
Findings
PAC-Bayesian bounds handle infinite-precision parameters and yield a bound for dropout training.
The training-variance bound offers a new perspective on bias-variance analysis.
Dropout training can be analyzed within the PAC-Bayesian framework.
Abstract
This tutorial gives a concise overview of existing PAC-Bayesian theory focusing on three generalization bounds. The first is an Occam bound which handles rules with finite precision parameters and which states that generalization loss is near training loss when the number of bits needed to write the rule is small compared to the sample size. The second is a PAC-Bayesian bound providing a generalization guarantee for posterior distributions rather than for individual rules. The PAC-Bayesian bound naturally handles infinite precision rule parameters, regularization, provides a bound for dropout training, and defines a natural notion of a single distinguished PAC-Bayesian posterior distribution. The third bound is a training-variance bound, a kind of bias-variance analysis but with bias replaced by expected training loss. The training-variance bound dominates the other…
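As a rough illustration of the Occam bound described in the abstract, the sketch below evaluates the standard Hoeffding-plus-union-bound form, which may differ from the exact statement used in the tutorial. It assumes a loss in [0, 1], a prefix-free code so that the prior P(h) = 2^(-bits) sums to at most one, and N i.i.d. training examples. Under those assumptions, with probability at least 1 - delta, every rule h satisfies L(h) <= L_train(h) + sqrt((bits * ln 2 + ln(1/delta)) / (2N)). The function names and the default delta are hypothetical, chosen only for this example.

```python
import math


def occam_gap(num_bits, sample_size, delta=0.05):
    """Gap term of a Hoeffding-style Occam bound (illustrative form only).

    Assumes a loss bounded in [0, 1] and a prior P(h) = 2 ** (-num_bits)
    induced by a prefix-free code for the rule h.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    if num_bits < 0:
        raise ValueError("num_bits must be non-negative")
    # ln(1 / P(h)) + ln(1 / delta), with ln(1 / P(h)) = num_bits * ln 2
    complexity = num_bits * math.log(2.0) + math.log(1.0 / delta)
    return math.sqrt(complexity / (2.0 * sample_size))


def occam_bound(train_loss, num_bits, sample_size, delta=0.05):
    """Upper bound on generalization loss, clipped to the loss range [0, 1]."""
    return min(1.0, train_loss + occam_gap(num_bits, sample_size, delta))


if __name__ == "__main__":
    # The gap shrinks as the sample grows relative to the description length.
    for n in (1_000, 10_000, 100_000):
        print(n, round(occam_gap(num_bits=100, sample_size=n), 4))
```

The gap depends on the ratio of description length to sample size, which matches the abstract's statement that generalization loss is near training loss when the bit count is small compared to the sample size.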
Taxonomy
Topics: Machine Learning and Algorithms · Neural Networks and Applications · Blind Source Separation Techniques
