Bayesian Double Descent
Nick Polson, Vadim Sokolov

TL;DR
This paper provides a Bayesian interpretation of the double descent phenomenon in over-parameterized models, connecting it with Bayesian model selection and shrinkage methods, and illustrating it with neural networks and non-parametric models.
Contribution
It introduces a Bayesian framework to understand double descent, linking it with classical Bayesian concepts and extending analysis to neural networks and non-parametric models.
Findings
Double descent can be explained through Bayesian model selection.
Bayesian methods reconcile double descent with Occam's razor.
The approach applies to high-dimensional neural networks and infinite Gaussian models.
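To make the model-selection and Occam's razor findings concrete, here is a minimal sketch of a density-ratio Bayes factor in a conjugate normal model. It is an illustrative toy, not the paper's derivation. The setup is an assumption: a point null M0 with theta = 0 against M1 with theta ~ N(0, tau2), and a sample mean ybar ~ N(theta, sigma2 / n). The function and parameter names are hypothetical.

```python
import math


def normal_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def bayes_factor_density_ratio(ybar, n, sigma2, tau2):
    """BF_01 as posterior density over prior density at theta = 0, both under M1."""
    s2 = sigma2 / n  # sampling variance of the sample mean
    post_var = 1.0 / (1.0 / tau2 + 1.0 / s2)  # conjugate normal update
    post_mean = post_var * ybar / s2
    return normal_pdf(0.0, post_mean, post_var) / normal_pdf(0.0, 0.0, tau2)


def bayes_factor_marginal(ybar, n, sigma2, tau2):
    """BF_01 as the ratio of marginal likelihoods m0(ybar) / m1(ybar)."""
    s2 = sigma2 / n
    return normal_pdf(ybar, 0.0, s2) / normal_pdf(ybar, 0.0, s2 + tau2)


if __name__ == "__main__":
    for ybar in (0.0, 0.5, 2.0):
        ratio = bayes_factor_density_ratio(ybar, n=10, sigma2=1.0, tau2=1.0)
        direct = bayes_factor_marginal(ybar, n=10, sigma2=1.0, tau2=1.0)
        print(f"ybar={ybar}: density ratio={ratio:.6g}, marginal ratio={direct:.6g}")
```

In this toy model the two computations agree. When the data sit at the null value (ybar = 0), the Bayes factor exceeds 1 and favors the simpler model, which is the Occam's razor effect of the marginal likelihood.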
Abstract
Double descent is a phenomenon in over-parameterized statistical models, such as deep neural networks, whose risk functions have a re-descending property. As model complexity increases, the risk first exhibits a U-shaped region due to the traditional bias-variance trade-off; then, as the number of parameters reaches the number of observations, the model becomes an interpolator and the risk can be unbounded; finally, in the over-parameterized region, the risk re-descends: the double descent effect. Our goal is to show that this has a natural Bayesian interpretation. We also show that this is not in conflict with the traditional Occam's razor: simpler models are preferred to complex ones, all else being equal. Our theoretical foundations use Bayesian model selection, the Dickey-Savage density ratio, and connect generalized ridge regression and global-local shrinkage…
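The risk curve described in the abstract can be sketched with a small simulation. This is a hedged illustration, not the authors' experiment. It assumes a linear model with Gaussian features and fits the first p features by minimum-norm least squares. The function name `risk_curve` and all parameter values are hypothetical choices.

```python
import numpy as np


def risk_curve(n=20, total_features=100, sigma=0.5, trials=50, seed=0):
    """Average train and test error of least squares on the first p features."""
    rng = np.random.default_rng(seed)
    sizes = np.arange(1, total_features + 1)
    train = np.zeros(len(sizes))
    test = np.zeros(len(sizes))
    for _ in range(trials):
        # Assumed data-generating process: signal spread over all features.
        beta = rng.normal(size=total_features) / np.sqrt(total_features)
        X = rng.normal(size=(n, total_features))
        y = X @ beta + sigma * rng.normal(size=n)
        X_new = rng.normal(size=(500, total_features))
        y_new = X_new @ beta + sigma * rng.normal(size=500)
        for i, p in enumerate(sizes):
            # pinv gives ordinary least squares when p < n and the
            # minimum-norm interpolating solution when p >= n.
            coef = np.linalg.pinv(X[:, :p]) @ y
            train[i] += np.mean((X[:, :p] @ coef - y) ** 2)
            test[i] += np.mean((X_new[:, :p] @ coef - y_new) ** 2)
    return sizes, train / trials, test / trials


if __name__ == "__main__":
    sizes, train, test = risk_curve()
    for p in (5, 10, 19, 20, 21, 40, 100):
        print(f"p={p:3d}  train={train[p - 1]:.3g}  test={test[p - 1]:.3g}")
```

Under these assumptions one would expect the averaged test error to rise sharply as p approaches n = 20, where the fit starts to interpolate the training data, and to fall again for larger p. The exact numbers depend on the seed and the assumed data-generating process.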
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Stochastic Gradient Optimization Techniques · Statistical Methods and Inference
