Occam's Razor is Only as Sharp as Your ELBO
Ethan Harvey, Michael C. Hughes

TL;DR
This paper examines how the evidence and the ELBO behave as model selection criteria. In a simple over-parameterized regression model, ELBO-based hyperparameter learning can overfit, depending on the assumed rank of the covariance matrix in the Gaussian approximate posterior.
Contribution
It demonstrates that ELBO-based hyperparameter learning can cause overfitting, not only the underfitting that prior work attributed to mean-field approximations, and it highlights the limitations of using the ELBO for model selection in large, over-parameterized models.
Findings
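ELBO-based hyperparameter learning can overfit, depending on the assumed rank of the covariance matrix in the Gaussian approximate posterior.
When restricted to only the underfit and overfit options, Bayesian model selection via the evidence sometimes prefers the overfit one, while the ELBO does not.
Reduced-rank covariance assumptions in the approximate posterior therefore affect how well ELBO-based model selection works.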
Abstract
The marginal likelihood, also known as the evidence, is regarded as a mathematical embodiment of Occam's razor, enabling model selection that avoids overfitting. The evidence lower bound (ELBO) objective from variational inference has also been used for similar purposes. Prior work has shown that restricting the approximate posterior family via a mean-field approximation can lead the ELBO to underfit. In this paper, we show how ELBO-based hyperparameter learning in a simple over-parameterized regression model can also produce overfitting, depending on the assumed rank of the covariance matrix in a Gaussian approximate posterior. Surprisingly, among only the underfit and overfit options, Bayesian model selection via the evidence itself sometimes prefers the overfit version, while the ELBO does not. Bayesian practitioners hoping to scale to large models should be cautious about how…
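To make the relationship between the evidence and the ELBO concrete, here is a minimal sketch for Bayesian linear regression with a Gaussian approximate posterior. It is not the paper's experiment: the prior precision `alpha`, noise precision `beta`, the data sizes, and the synthetic data are all assumptions chosen for illustration, and the restricted family shown is a diagonal (mean-field) covariance rather than the reduced-rank family the paper studies. The sketch only checks two standard facts: the ELBO equals the log evidence when the approximate posterior is the exact posterior, and it falls strictly below the log evidence when the covariance is restricted.

```python
import numpy as np

def log_evidence(X, y, alpha, beta):
    """Log marginal likelihood of y under w ~ N(0, I/alpha), y | w ~ N(Xw, I/beta)."""
    n = X.shape[0]
    C = np.eye(n) / beta + X @ X.T / alpha  # marginal covariance of y
    _, logdet = np.linalg.slogdet(C)
    return -0.5 * (n * np.log(2 * np.pi) + logdet + y @ np.linalg.solve(C, y))

def elbo(X, y, alpha, beta, m, S):
    """ELBO for a Gaussian approximate posterior q(w) = N(m, S)."""
    n, d = X.shape
    resid = y - X @ m
    # Expected log-likelihood under q
    exp_loglik = 0.5 * n * np.log(beta / (2 * np.pi)) - 0.5 * beta * (
        resid @ resid + np.trace(X @ S @ X.T)
    )
    # KL(q || prior) between N(m, S) and N(0, I/alpha)
    _, logdet_S = np.linalg.slogdet(S)
    kl = 0.5 * (alpha * np.trace(S) + alpha * (m @ m) - d - d * np.log(alpha) - logdet_S)
    return exp_loglik - kl

# Hypothetical over-parameterized setup: more weights (d) than examples (n)
rng = np.random.default_rng(0)
n, d = 10, 30
X = rng.normal(size=(n, d))
y = X[:, 0] + 0.1 * rng.normal(size=n)
alpha, beta = 1.0, 100.0  # assumed hyperparameters, not values from the paper

A = alpha * np.eye(d) + beta * X.T @ X   # exact posterior precision
S_full = np.linalg.inv(A)                # exact posterior covariance
m = beta * S_full @ X.T @ y              # exact posterior mean
S_mf = np.diag(1.0 / np.diag(A))         # ELBO-optimal diagonal covariance

log_ev = log_evidence(X, y, alpha, beta)
elbo_full = elbo(X, y, alpha, beta, m, S_full)
elbo_mf = elbo(X, y, alpha, beta, m, S_mf)

print("log evidence:       ", log_ev)
print("ELBO, full cov:     ", elbo_full)
print("ELBO, mean-field:   ", elbo_mf)
```

Because the gap between the log evidence and the ELBO depends on the approximating family, maximizing the ELBO over hyperparameters such as `alpha` need not select the same values as maximizing the evidence. That gap is the quantity the abstract's caution concerns.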
