Dissecting Non-Vacuous Generalization Bounds based on the Mean-Field Approximation
Konstantinos Pitas

TL;DR
This paper critically examines the effectiveness of mean-field variational inference in PAC-Bayes bounds for neural networks, finding it offers limited benefits and advocating for richer posterior models.
Contribution
The study empirically demonstrates the limitations of mean-field approximations in PAC-Bayes bounds and suggests exploring more complex posteriors for better generalization guarantees.
Findings
Mean-field VI yields negligible improvements in bounds.
Optimization issues are not the main cause of poor bounds.
Richer posterior models are promising for future research.
Abstract
Explaining how overparametrized neural networks simultaneously achieve low risk and zero empirical risk on benchmark datasets is an open problem. PAC-Bayes bounds optimized using variational inference (VI) have recently been proposed as a promising direction for obtaining non-vacuous bounds. We show empirically that this approach gives negligible gains when the posterior is modeled as a Gaussian with diagonal covariance, known as the mean-field approximation. We investigate common explanations, such as the failure of VI due to optimization problems or the choice of a suboptimal prior. Our results suggest that investigating richer posteriors is the most promising direction forward.
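To make the quantities in the abstract concrete, the sketch below computes the KL divergence between two diagonal-covariance Gaussians (the mean-field posterior and a Gaussian prior) and plugs it into one common McAllester-style PAC-Bayes bound. This is an illustrative assumption, not necessarily the exact bound or setup used in the paper; the function names `kl_diag_gaussians` and `pac_bayes_bound` are hypothetical.

```python
import math


def kl_diag_gaussians(mu_q, var_q, mu_p, var_p):
    """KL(Q || P) for two Gaussians with diagonal covariance.

    Each argument is a sequence of per-weight means or variances.
    With diagonal covariances the KL is a sum of per-coordinate terms.
    """
    total = 0.0
    for mq, vq, mp, vp in zip(mu_q, var_q, mu_p, var_p):
        total += 0.5 * (math.log(vp / vq) + (vq + (mq - mp) ** 2) / vp - 1.0)
    return total


def pac_bayes_bound(emp_risk, kl, n, delta=0.05):
    """McAllester-style bound (assumed form, losses in [0, 1]):

    risk <= emp_risk + sqrt((KL + ln(2 * sqrt(n) / delta)) / (2 * n)),
    holding with probability at least 1 - delta over n i.i.d. samples.
    """
    complexity = (kl + math.log(2.0 * math.sqrt(n) / delta)) / (2.0 * n)
    return emp_risk + math.sqrt(complexity)


# Toy example with three weights; all numbers are made up for illustration.
kl = kl_diag_gaussians(
    mu_q=[0.5, -0.2, 1.0], var_q=[0.1, 0.1, 0.1],
    mu_p=[0.0, 0.0, 0.0], var_p=[1.0, 1.0, 1.0],
)
bound = pac_bayes_bound(emp_risk=0.02, kl=kl, n=50000)
print(f"KL = {kl:.3f}, bound on risk = {bound:.3f}")
```

A bound of this kind is non-vacuous only when it falls below 1 for a 0-1 loss. With millions of weights the KL term grows with the number of parameters, which is why the choice of prior and posterior family matters so much.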
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Machine Learning and Data Classification · Generative Adversarial Networks and Image Synthesis
