Demystify Optimization and Generalization of Over-parameterized PAC-Bayesian Learning
Wei Huang, Chunrui Liu, Yilan Chen, Tianyu Liu, and Richard Yi Da Xu

TL;DR
This paper provides a theoretical analysis of PAC-Bayesian learning for over-parameterized neural networks trained with gradient descent, linking it to kernel ridge regression and offering a new generalization bound and hyperparameter selection method.
Contribution
It introduces a convergence and generalization analysis for PAC-Bayesian neural network training, connecting it to kernel methods and proposing a practical hyperparameter proxy.
Findings
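PAC-Bayesian training of over-parameterized networks by gradient descent corresponds to kernel ridge regression with the probabilistic neural tangent kernel (PNTK) as its kernel.
The derived generalization bound improves over the Rademacher complexity-based bound for non-probabilistic neural networks.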
A time-saving hyperparameter selection proxy is proposed.
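To make the kernel ridge regression finding concrete, here is a minimal sketch of the closed-form solve, alpha = (K + lam * I)^{-1} y. It is an illustration under stated assumptions, not the paper's method: the RBF kernel is a stand-in for the PNTK (which this summary does not define), and the function names, the toy data, and the value of `lam` are all hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Stand-in kernel (assumption): the paper uses the PNTK, not an RBF kernel.
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def krr_fit(K, y, lam):
    # Solve (K + lam * I) alpha = y for the dual coefficients.
    n = K.shape[0]
    return np.linalg.solve(K + lam * np.eye(n), y)

def krr_predict(K_query_train, alpha):
    # Predictions are kernel-weighted sums over the training points.
    return K_query_train @ alpha

# Toy 1-D regression problem (hypothetical data, for illustration only).
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(40, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(40)

K = rbf_kernel(X, X)
alpha = krr_fit(K, y, lam=0.1)
train_pred = krr_predict(K, alpha)
```

The ridge parameter `lam` controls how strongly the fit is pulled toward zero: larger values shrink the predictions, smaller values track the training labels more closely.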
Abstract
PAC-Bayes is an analysis framework in which the training error is expressed as a weighted average over the hypotheses in the posterior distribution while incorporating prior knowledge. Besides serving purely as a tool for analyzing generalization bounds, a PAC-Bayesian bound can also be incorporated into an objective function to train a probabilistic neural network, making it a powerful and relevant framework that can numerically provide a tight generalization bound for supervised learning. For simplicity, we refer to training a probabilistic neural network with objectives derived from PAC-Bayesian bounds as PAC-Bayesian learning. Despite its empirical success, the theoretical analysis of PAC-Bayesian learning for neural networks is rarely explored. This paper proposes a new class of convergence and generalization analysis for PAC-Bayes learning when it is used to train…
Taxonomy
Topics: Machine Learning and Algorithms · Neural Networks and Applications · Advanced Neural Network Applications
